1.   INTRODUCTION

Recent progress in hardware technology and computer architecture
has led to the design and construction of computer systems that contain
a large number of processors. Because such systems can execute
several tasks simultaneously, the problem of job scheduling in a
multiprocessor system is of both theoretical and practical interest. Several
authors have designed scheduling algorithms to produce schedules which
minimize the total execution time of a given set of tasks and thus achieve
optimal utilization of the processors [1-3]. Unfortunately, such
algorithms are known only for some special cases. Furthermore, in many
instances, the algorithm that produces optimal schedules is so complex that
the reduction in the total execution time of a set of tasks is offset by
the computation time required to determine an optimal schedule. For this
reason, simple algorithms that produce only sub-optimal schedules are often
used in practice. Such a choice becomes even more appealing when the
performance of the simple algorithms producing sub-optimal schedules can
be compared quantitatively with that of algorithms producing optimal
schedules. For this reason, lower bounds on the performance of simple
algorithms have been studied extensively [4, 5].

In  previous  works  on  job  scheduling,  a  multiprocessor  computing 
system  is  modelled  as  one  containing  identical  processors.   We  introduce  here 
a  more  general  model  in  which  different  processors  have  different  computation 
speeds.   This  model  is  a  realistic  one  when  we  consider  the  possibilities  of 
replacing  one  or  more  of  the  processors  in  an  existing  system  by  faster 
processors  and  of  interconnecting  different  computers  in  an  installation. 
It  will  be  described  precisely  in  Section  2. 


In Section 3, lower bounds on the performance of a class of simple
nonpreemptive  algorithms  are  derived.   These  results  will  also  provide 
us  with  information  concerning  the  relative  merit  of  different  computing 
systems  and  the  trade-off  between  the  speeds  and  the  number  of  processors 
in  a  multiprocessor  system. 

Algorithms  which  produce  preemptive  schedules  with  minimum  total 
execution  time  have  been  found  in  some  special  cases  [3]  for  systems 
containing  identical  processors.  Algorithms  which  produce  optimal 
schedules  of  independent  tasks  when  the  multiprocessor  systems  contain 
different processors are described in Section 4. The performance of
preemptive  scheduling  algorithms  is  compared  to  that  of  nonpreemptive 
scheduling  algorithms  studied  in  Section  3. 

An  algorithm  which  produces  schedules  with  minimum  mean  flow  time 
is  described  in  Section  5.   Performances  of  different  computing  systems, 
using  mean  flow  time  as  a  criterion,  are  compared. 


2.   GENERAL  MODEL 

2.1  A  Model  of  Heterogeneous  Computing  System 

By a heterogeneous computing system, we mean a multiprocessor
system in which processors have different computing speeds. We measure
the speed of a processor against that of a "standard" processor whose
speed is considered as 1. The speed of a processor is said to be $b$ if
it is $b$ times as fast as a standard processor. Without loss of generality,
we shall assume that $b \ge 1$ throughout our discussion. Let us denote a
multiprocessor system which contains $n_1$ processors of speed $b_1$, $n_2$
processors of speed $b_2$, ..., $n_k$ processors of speed $b_k$ by
$\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$. Furthermore,
the $N$ processors will be referred to individually as processors $P_i$ for
$i = 1, 2, \ldots, N$ where $N = n_1 + n_2 + \cdots + n_k$. In particular,
the $n_1$ processors of speed $b_1$ are referred to as $P_1, P_2, \ldots,
P_{n_1}$; the $n_2$ processors of speed $b_2$ are referred to as
$P_{n_1+1}, P_{n_1+2}, \ldots, P_{n_1+n_2}$, etc. For example, the system
$\mathcal{P} = (1, 3; 2, 1)$ contains four processors $P_1$, $P_2$, $P_3$,
and $P_4$ whose speeds are 2, 1, 1, and 1, respectively.
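
The system notation above can be made concrete with a small sketch. The function names `processor_speeds` and `total_capacity` are illustrative assumptions, not part of the report; the code only expands the tuple $(n_1, \ldots, n_k; b_1, \ldots, b_k)$ into the per-processor speeds of $P_1, \ldots, P_N$.

```python
def processor_speeds(counts, speeds):
    """Expand (n_1, ..., n_k; b_1, ..., b_k) into [speed of P_1, ..., speed of P_N]."""
    if len(counts) != len(speeds):
        raise ValueError("need exactly one speed per processor class")
    expanded = []
    for n_i, b_i in zip(counts, speeds):
        # The n_i processors of speed b_i receive consecutive indices.
        expanded.extend([b_i] * n_i)
    return expanded


def total_capacity(counts, speeds):
    """Sum of n_i * b_i over all processor classes."""
    return sum(n_i * b_i for n_i, b_i in zip(counts, speeds))


# The example system (1, 3; 2, 1) from the text: P_1 is fast, P_2..P_4 are standard.
print(processor_speeds((1, 3), (2, 1)))   # [2, 1, 1, 1]
print(total_capacity((1, 3), (2, 1)))     # 5
```

Keeping the speeds in index order matters later, because the priority-driven rule breaks ties among free processors by smallest index.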

2.2  Definitions  and  Notations 

Let $\mathcal{T} = \{T_1, T_2, \ldots, T_m\}$ denote a set of tasks to be
processed by a system $\mathcal{P}$. The execution time of a task is defined
as the time required to complete the task on a standard processor. We shall
denote the execution time of the task $T_i$ by $\mu(T_i)$, where $\mu$ is a
function from $\mathcal{T}$ to the reals. In other words, $\mu$ is a function
that specifies the execution times of the tasks in $\mathcal{T}$. Furthermore,
we suppose that there is a precedence relation $<$ defined over $\mathcal{T}$.
That $T_i < T_j$ (read: $T_i$ precedes $T_j$, or $T_j$ follows $T_i$) shall
mean that $T_j$ cannot begin to be executed before the execution of $T_i$ is
completed. A task is said to be executable at a certain time if the execution
of all tasks preceding it has been completed. Formally, a set of tasks is
specified by an ordered triple $(\mathcal{T}, \mu, <)$. We also describe a set
of tasks $(\mathcal{T}, \mu, <)$ by a directed graph whose vertices correspond
to the tasks and are labeled by the names, $T_j$, and their execution times,
$\mu(T_j)$. There is a directed edge from $T_i$ to $T_j$ if $T_i < T_j$. For
example, a set of tasks $(\mathcal{T}, \mu, <)$ is shown in Figure 2-1, where
$\mathcal{T} = \{T_1, T_2, T_3, T_4, T_5\}$ and the edges of the graph give
the precedence relation $<$.

By  scheduling  a  set  of  tasks  on  a  multiprocessor  system,  we  mean 
to  specify  for  each  task  the  time  interval  within  which  it  is  to  be 
executed  and  the  processor  on  which  execution  will  take  place.  A  schedule 
can  be  described  by  a  timing  diagram  (also  known  as  the  Gantt  chart)  such 
as  that  shown  in  Figure  2-2.   In  this  timing  diagram,  each  horizontal 
line  is  a  time  axis  and  its  subdivisions  give  the  sequence  of  tasks 
executed  on  a  processor  together  with  the  idle  periods  of  the  processor. 
The  idle  periods  are  the  time  intervals  within  which  the  processor  is  not 
executing any tasks. We use $\varphi_1, \varphi_2, \ldots$ to denote idle
periods of the processors as shown in Figure 2-2. With a slight abuse of
notation, we use $\mu(\varphi_1), \mu(\varphi_2), \ldots$ to denote the
lengths of the idle periods.

The completion time of a schedule is the total time it takes to
execute all the tasks according to the schedule.† For example, the
completion time of the set of tasks $(\mathcal{T}, \mu, <)$ shown in
Figure 2-1 according to the schedule shown in Figure 2-2 is 6.5. When the
completion time is used as a criterion for comparing different scheduling
algorithms, an optimal schedule for a given set of tasks
$(\mathcal{T}, \mu, <)$ is one with the minimal completion time.

† Throughout this report, we assume that the execution of a set of tasks
begins at t = 0.

For a given schedule, the flow time $t_i$ of a particular task $T_i$
is defined to be the time at which its execution is completed. For
example, for the schedule shown in Figure 2-2, two of the tasks have flow
times equal to 1.5 and 2.5, respectively. The mean flow time $\bar{t}$ of a
schedule for a set of tasks $(\mathcal{T}, \mu, <)$ is equal to the sum of
the flow times of all tasks in $\mathcal{T}$. That is,

$$\bar{t} = \sum_{i=1}^{m} t_i .$$

The mean flow time of the schedule shown in Figure 2-2 is 15. It is clear
that the mean flow time of a schedule represents, up to division by the
number of tasks, the average waiting time (turnaround time) of the tasks in
$\mathcal{T}$. This parameter will also be used
as  a  criterion  for  comparing  the  performance  of  different  scheduling 
algorithms. 
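
The two criteria can be computed directly from the finishing times of a schedule. The sketch below is illustrative only: the task names and finishing times are hypothetical (they are not the values of Figure 2-2), and the function names are assumptions. Note that `mean_flow_time` follows the definition given above, i.e. the plain sum of the flow times.

```python
def completion_time(finish_times):
    """Completion time of a schedule: the moment the last task finishes."""
    return max(finish_times.values())


def mean_flow_time(finish_times):
    """Mean flow time as defined in the text: the sum of all flow times."""
    return sum(finish_times.values())


# Hypothetical finishing times for three tasks, all started from t = 0.
finish = {"T1": 1.0, "T2": 2.5, "T3": 4.0}
print(completion_time(finish))  # 4.0
print(mean_flow_time(finish))   # 7.5
```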

2.3  A  More  General  Model  of  Heterogeneous  Computing  Systems 

Let  us  note  that  the  model  of  heterogeneous  computing  systems 
described  in  Section  2.1  can  be  generalized  by  assuming  that  it  takes 
different  amounts  of  time  to  execute  a  task  on  different  processors  but 
the  difference  in  execution  times  is  not  uniform  among  different  tasks. 
For example, a lengthy computation task $T_1$ can be executed on processor
$P_1$, which contains a fast arithmetic unit, in a smaller amount of time than
on processor $P_2$. But another task $T_2$ consisting primarily of I/O
operations may be executed in a smaller amount of time on processor $P_2$
instead. For a system $\mathcal{P}$ containing $N$ processors, the execution
times of the tasks can be specified by $N$ functions $\mu_1, \mu_2, \ldots,
\mu_N$ where $\mu_i(T_j)$ is the execution time of task $T_j$ on processor
$P_i$. No general result is known for this
model  of  heterogeneous  computing  systems.   In  this  report,  we  shall  not 
be  concerned  with  this  model. 


3.   PRIORITY-DRIVEN  SCHEDULING  ALGORITHMS 

Throughout  this  section,  we  shall  use  completion  time  as  the 
criterion  for  comparison  of  the  performance  of  different  scheduling 
algorithms.  We  assume  that  once  a  task  is  assigned  to  a  processor,  it 
will be executed until completion. That is, preemption is not allowed.
We are concerned, in particular, with the class of non-preemptive
scheduling algorithms that never leave processors idle intentionally.
(We note that it is sometimes necessary to leave a processor idle
intentionally in order to minimize the completion time of a set of tasks.)
In other words, when a processor becomes free, a task that is executable
at that moment is scheduled. These algorithms can be described by
assignments of priorities to the tasks. At the moment a processor becomes
free, the task that has the highest priority among all executable tasks
is scheduled. If several processors are free at the same time, the task
is executed on the processor $P_i$ with the smallest index $i$ among the free
processors. Such algorithms are known as priority-driven scheduling
algorithms. Schedules produced by such algorithms are called
priority-driven schedules. As an example, for the set of tasks
$(\mathcal{T}, \mu, <)$ shown in Figure 2-1, the priority-driven schedule
according to the priority list† $(T_1, T_2, T_3, T_4, T_5)$ on the system
$\mathcal{P} = (1, 1; 2, 1)$ is that shown in Figure 2-2.

† In order of decreasing priorities.
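
The priority-driven rule described above can be sketched as a small event-driven simulation. This is a minimal illustration under stated assumptions, not the report's own procedure: the function name, argument layout, and the use of exact fractions are choices made here. The demonstration instance is the worst-case example used in the proof of Theorem 3.1, with $n = 3$, $b = 2$ and a hypothetical $\epsilon = 1/100$.

```python
from fractions import Fraction


def priority_driven_schedule(speeds, mu, precedes, priority_list):
    """Simulate a nonpreemptive priority-driven schedule.

    speeds        -- speeds of P_1, ..., P_N in index order
    mu            -- dict: task -> execution time on a standard processor
    precedes      -- iterable of pairs (a, b) meaning a < b
    priority_list -- all tasks, in order of decreasing priority
    Returns (completion time, dict: task -> (processor index, start, finish)).
    """
    predecessors = {task: set() for task in mu}
    for a, b in precedes:
        predecessors[b].add(a)

    busy_until = [Fraction(0)] * len(speeds)
    placed = {}
    waiting = list(priority_list)
    now = Fraction(0)

    while waiting:
        finished = {t for t, (_, _, end) in placed.items() if end <= now}
        for p, speed in enumerate(speeds):       # smallest index first
            if busy_until[p] > now:
                continue                         # this processor is still busy
            task = next((t for t in waiting if predecessors[t] <= finished), None)
            if task is None:
                break                            # nothing is executable right now
            end = now + Fraction(mu[task]) / speed
            placed[task] = (p + 1, now, end)
            busy_until[p] = end
            waiting.remove(task)
        if waiting:
            later = [t for t in busy_until if t > now]
            if not later:
                raise ValueError("precedence relation contains a cycle")
            now = min(later)                     # jump to the next finishing time

    total = max((end for _, _, end in placed.values()), default=Fraction(0))
    return total, placed


# Worst-case instance from the proof of Theorem 3.1 (tasks are numbered 1..2n+1).
n, b, eps = 3, 2, Fraction(1, 100)
speeds = [b] + [1] * n
mu = {i: b for i in range(1, n)}
mu[n] = b + eps
mu.update({i: n for i in range(n + 1, 2 * n + 1)})
mu[2 * n + 1] = (n + b) * b

bad = [1] + list(range(n + 1, 2 * n + 1)) + list(range(2, n + 1)) + [2 * n + 1]
good = [2 * n + 1] + list(range(n + 1, 2 * n + 1)) + list(range(1, n + 1))

omega, _ = priority_driven_schedule(speeds, mu, [], bad)
omega_good, _ = priority_driven_schedule(speeds, mu, [], good)
print(omega)       # 13, i.e. n + n*b + b**2
print(omega_good)  # 501/100, i.e. n + b + eps
```

Exact `Fraction` arithmetic is used so that ties between finishing times are resolved exactly; with floating point, the $\epsilon$-sized offsets that drive the worst case could be blurred.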


3.1  Performance of Priority-Driven Scheduling Algorithms

In this section, we establish a lower bound on the performance of
priority-driven schedules in which priorities are assigned to tasks in a
completely arbitrary manner. For simplicity, let us consider first a
multiprocessor system $\mathcal{P} = (1, n; b, 1)$, that is, a system
containing a processor $P_1$ of speed $b$ and $n$ processors $P_2, P_3,
\ldots, P_{n+1}$ of speed 1. We have,

Theorem 3.1

Let $(\mathcal{T}, \mu, <)$ be a set of tasks to be executed on the system
$\mathcal{P} = (1, n; b, 1)$. Let $\omega$ and $\omega'$ denote the completion
times when $(\mathcal{T}, \mu, <)$ is executed according to a priority-driven
schedule specified by the priority list $L$ and an arbitrary schedule,
respectively. Then,

$$\frac{\omega}{\omega'} \le b + \frac{n}{b+n} .$$

Moreover, this bound is best possible.

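Proof. In the priority-driven schedule specified by the priority
list $L$, let $\mathcal{T}_1$ denote the set of tasks run on processor $P_1$
and $\mathcal{T}_2$ denote the set of tasks run on processors $P_2, P_3,
\ldots, P_{n+1}$. Also, let $\Phi_1$ and $\Phi_2$ denote the corresponding
sets of idle periods of processor $P_1$ and processors $P_2, P_3, \ldots,
P_{n+1}$, respectively. Let

$$U_1 = \sum_{T_i \in \mathcal{T}_1} \mu(T_i), \qquad
U_2 = \sum_{T_i \in \mathcal{T}_2} \mu(T_i),$$

$$I_1 = \sum_{\varphi_i \in \Phi_1} \mu(\varphi_i), \qquad
I_2 = \sum_{\varphi_i \in \Phi_2} \mu(\varphi_i).$$

We have

$$\omega = \frac{1}{n+1}\Bigl[\frac{U_1}{b} + U_2 + I_1 + I_2\Bigr]
= \frac{1}{n+1}\Bigl[\frac{U_1+U_2}{b} + \frac{b-1}{b}(U_2+I_2)
+ \frac{I_1+I_2}{b} + \frac{b-1}{b}\,I_1\Bigr] \tag{3-1}$$

Clearly,

$$\omega' \ge \frac{1}{n+b}\sum_{T_i \in \mathcal{T}} \mu(T_i)
= \frac{U_1+U_2}{n+b} \tag{3-2}$$

Also

$$\frac{U_2+I_2}{n} = \omega \tag{3-3}$$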

To bound the magnitudes of $I_1$ and $I_2$, let us consider an idle period
$\varphi_m$ of a certain processor, $P_k$. Let $t_1$ and $t_2$ denote the
times at which $\varphi_m$ begins and ends. We have the following
observations: (i) If the execution of a task $T_j$ on another processor
begins at $t$ where $t_1 \le t < t_2$, then there must exist a task $T_i$
such that $T_i < T_j$ and the execution of $T_i$ ends at $t$. Otherwise,
$T_j$ would have been executed prior to $t$ on processor $P_k$. (ii) If the
execution of a task $T_j$ begins at $t$ where $t_2 \le t$, then there must
exist a task $T_i$ such that $T_i < T_j$ and the execution of $T_i$ ends at
$t_2$. Combining these two observations, we conclude that if
$\varphi_{m_1}, \varphi_{m_2}, \ldots, \varphi_{m_r}$ are the idle periods of
a certain processor, there exists a set of tasks
$\mathcal{C} = \{T_{i_1}, T_{i_2}, \ldots, T_{i_s}\}$ such that

(i) $T_{i_1} < T_{i_2} < \cdots < T_{i_s}$;

(ii) at any moment within any one of the idle periods $\varphi_{m_1},
\varphi_{m_2}, \ldots, \varphi_{m_r}$, one of the tasks in $\mathcal{C}$ is
being executed on another processor.

That is,

$$\sum_{T_k \in \mathcal{C}} \mu(T_k) \ge
\sum_{\varphi_i \in \{\varphi_{m_1}, \ldots, \varphi_{m_r}\}} \mu(\varphi_i).$$

Since the tasks in $\mathcal{C}$ must be executed one after another in any
schedule, $\omega' \ge \frac{1}{b}\sum_{T_k \in \mathcal{C}} \mu(T_k)$. It
follows that

$$I_1 \le b\,\omega' \tag{3-4}$$

Moreover,

$$n \sum_{T_k \in \mathcal{C}} \mu(T_k) \ge I_1 + I_2 .$$

Hence,

$$\frac{I_1+I_2}{n} \le b\,\omega' \tag{3-5}$$

Substituting this inequality and (3-2)-(3-4) into eq. (3-1), we obtain

$$\omega \le \frac{1}{n+1}\Bigl[\frac{b+n}{b}\,\omega' + \frac{b-1}{b}\,n\omega
+ n\omega' + (b-1)\,\omega'\Bigr]$$

which simplifies to

$$\frac{\omega}{\omega'} \le b + \frac{n}{b+n} .$$


That this bound is the best possible can be demonstrated by the
following example. Consider a set of tasks $(\mathcal{T}, \mu, <)$ where
$\mathcal{T} = \{T_1, T_2, \ldots, T_{2n}, T_{2n+1}\}$,

$$\mu(T_i) = b, \qquad i = 1, 2, \ldots, n-1$$

$$\mu(T_n) = b + \epsilon$$

$$\mu(T_i) = n, \qquad i = n+1, n+2, \ldots, 2n$$

$$\mu(T_{2n+1}) = (n+b)\,b$$

with $\epsilon$ being a positive number, and $<$ is empty. The priority list

$$L = (T_1, T_{n+1}, T_{n+2}, \ldots, T_{2n}, T_2, T_3, \ldots, T_n, T_{2n+1})$$

yields a schedule whose completion time is $n + nb + b^2$ for very small
$\epsilon$ as shown in Figure 3-1a. On the other hand, the priority list

$$L = (T_{2n+1}, T_{n+1}, T_{n+2}, \ldots, T_{2n}, T_1, T_2, \ldots, T_n)$$

yields a schedule whose completion time is $n + b$ for very small $\epsilon$
as shown in Figure 3-1b. □

Let us observe that the upper bound of the ratio $\omega/\omega'$ in
Theorem 3.1 is approximately equal to $b + 1$ for large $n$. That is, even
when there is only one fast processor in a multiprocessor system with many
standard processors, the worst case performance of an arbitrary scheduling
algorithm when compared to that of an optimal scheduling algorithm still
depends primarily on the speed of the fast processor. For a fixed $n$, the
upper bound of $\omega/\omega'$ approaches $b$ as a limit when $b$ increases.
This fact indicates that a priority-driven schedule may become very inferior
for large $b$. As a matter of fact, when $b$ is very large, better
performance can often be assured by using only the fast processor to execute
all the tasks in $\mathcal{T}$. In that case, the completion time approaches
the minimum completion time as a limit for large $b$. We shall return to this
point in Sec. 3-3.
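
The two limiting statements about the bound $b + n/(b+n)$ can be checked numerically. The helper name `theorem_3_1_bound` is an assumption made for this sketch; it simply evaluates the expression stated in Theorem 3.1 with exact fractions.

```python
from fractions import Fraction


def theorem_3_1_bound(n, b):
    """Upper bound b + n/(b+n) on omega/omega' for the system (1, n; b, 1)."""
    b = Fraction(b)
    return b + Fraction(n) / (b + n)


# For fixed b, the gap to b + 1 equals b/(b+n) and vanishes as n grows.
for n in (1, 10, 100, 1000):
    bound = theorem_3_1_bound(n, 2)
    print(n, float(bound), float(3 - bound))

# For fixed n, the excess over b equals n/(b+n) and vanishes as b grows.
for b in (1, 10, 100, 1000):
    print(b, float(theorem_3_1_bound(4, b) - b))
```

For $n = 3$, $b = 2$ the bound is $13/5$, which is the ratio $(n + nb + b^2)/(n+b)$ of the two completion times in the example that closes the proof.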

The result in Theorem 3.1 can be extended immediately.

Theorem 3.2

Consider a multiprocessor system $\mathcal{P} = (n_1, n_2, \ldots, n_k;
b_1, b_2, \ldots, b_k)$. Let $\omega$ and $\omega'$ be the completion times
of a set of tasks $(\mathcal{T}, \mu, <)$ when it is run on $\mathcal{P}$
according to a priority-driven schedule and an arbitrary schedule,
respectively. Suppose that $b_1 > b_2 > \cdots > b_k$. We have

$$\frac{\omega}{\omega'} \le 1 + \frac{b_1}{b_k}
- \frac{b_1}{\sum_{i=1}^{k} n_i b_i}$$

Moreover, the bound is the best possible.

Figure 3-1 (timing diagrams (a) and (b) of the two schedules in the example of Theorem 3.1)


Proof. In the priority-driven schedule specified by a priority
list $L$, let $U_i$ denote the sum of the execution times of all tasks
executed on the processors of speed $b_i$. Let $I_i$ denote the sum of the
lengths of the idle periods on the processors of speed $b_i$. We have

$$\omega = \frac{1}{\sum_{i=1}^{k} n_i}
\Bigl[\frac{U_1}{b_1} + \frac{U_2}{b_2} + \cdots + \frac{U_k}{b_k}
+ I_1 + I_2 + \cdots + I_k\Bigr]$$

$$= \frac{1}{\sum_{i=1}^{k} n_i}
\Bigl[\frac{1}{b_1}\sum_{i=1}^{k} U_i
+ \sum_{i=1}^{k} \frac{b_1-b_i}{b_1}\Bigl(\frac{U_i}{b_i} + I_i\Bigr)
+ \frac{b_k}{b_1}\sum_{i=1}^{k} I_i
+ \sum_{i=1}^{k-1} \frac{b_i-b_k}{b_1}\,I_i\Bigr] \tag{3-6}$$

Clearly

$$\frac{\sum_{i=1}^{k} U_i}{\sum_{i=1}^{k} n_i b_i} \le \omega' \tag{3-7}$$

and

$$\frac{U_i}{b_i} + I_i = n_i\,\omega \tag{3-8}$$


Using the same argument presented in the proof of Theorem 3.1, we conclude
that the total length of the idle periods of any processor must be equal to
or less than $b_1\omega'/b_k$. Hence

$$\frac{b_k I_i}{n_i} \le b_1\,\omega' \tag{3-9}$$

Moreover,

$$\sum_{i=1}^{k} I_i \le \Bigl(\sum_{i=1}^{k} n_i - 1\Bigr)\frac{b_1\,\omega'}{b_k}$$

Consequently,

$$\frac{b_k \sum_{i=1}^{k} I_i}{\sum_{i=1}^{k} n_i - 1} \le b_1\,\omega' \tag{3-10}$$


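Substituting (3-7)-(3-10) into eq. (3-6), we obtain

$$\omega \le \frac{1}{\sum_{i=1}^{k} n_i}
\Bigl[\frac{\sum_{i=1}^{k} n_i b_i}{b_1}\,\omega'
+ \sum_{i=1}^{k} \frac{b_1-b_i}{b_1}\,n_i\,\omega
+ \frac{b_k}{b_1}\,\frac{b_1}{b_k}\Bigl(\sum_{i=1}^{k} n_i - 1\Bigr)\omega'
+ \sum_{i=1}^{k-1} \frac{b_i-b_k}{b_1}\,\frac{b_1}{b_k}\,n_i\,\omega'\Bigr]$$

Multiplying both sides of this expression by $b_1 b_k \sum_{i=1}^{k} n_i$,
we obtain

$$b_k\Bigl[b_1\sum_{i=1}^{k} n_i - \sum_{i=2}^{k}(b_1-b_i)\,n_i\Bigr]\omega
\le \Bigl[b_k\sum_{i=1}^{k} n_i b_i
+ b_1 b_k\Bigl(\sum_{i=1}^{k} n_i - 1\Bigr)
+ b_1\sum_{i=1}^{k-1}(b_i-b_k)\,n_i\Bigr]\omega'$$

which simplifies to

$$\Bigl(b_k\sum_{i=1}^{k} n_i b_i\Bigr)\omega \le
\Bigl(b_1\sum_{i=1}^{k} n_i b_i + b_k\sum_{i=1}^{k} n_i b_i - b_1 b_k\Bigr)\omega'$$

That is,

$$\frac{\omega}{\omega'} \le 1 + \frac{b_1}{b_k}
- \frac{b_1}{\sum_{i=1}^{k} n_i b_i}$$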


To show that the bound is best possible, we consider a set of tasks
$(\mathcal{T}, \mu, <)$ where the precedence relation $<$ is empty and

$$\mathcal{T} = \{T_{rpq} \mid r = 1, 2, \ldots, k;\ q = 1, 2, \ldots, N;\
p = 1, 2, \ldots, n_1 - 1 \text{ for } r = 1 \text{ and }
p = 1, 2, \ldots, n_r \text{ for } r \ne 1\}
\ \cup\ \{T_s \mid s = 1, 2, \ldots, \sum_{i=1}^{k-1} n_i + 1\}.$$

The execution times of the tasks $T_{rpq}$ are

$$\mu(T_{rpq}) = \begin{cases}
b_r b_k b_1, & 1 \le q \le n_1 \\
b_r b_k b_2, & n_1 + 1 \le q \le n_1 + n_2 \\
\quad\vdots \\
b_r b_k b_{k-1}, & \sum_{i=1}^{k-2} n_i + 1 \le q \le \sum_{i=1}^{k-1} n_i \\
b_r b_k b_k, & \sum_{i=1}^{k-1} n_i + 1 \le q \le \sum_{i=1}^{k} n_i
\end{cases}$$

for $r = 1, 2, \ldots, k$ and $p = 1, 2, \ldots, n_1 - 1$ if $r = 1$ and
$p = 1, 2, \ldots, n_r$ if $r \ne 1$. The execution times of the tasks $T_s$
are

$$\mu(T_s) = \epsilon, \qquad s = 1, 2, \ldots, \sum_{i=1}^{k-1} n_i$$

$$\mu(T_S) = b_1 b_k \sum_{i=1}^{k} n_i b_i$$

where $\epsilon$ is a small positive number and $S = \sum_{i=1}^{k-1} n_i + 1$.
Let us consider a priority-driven schedule according to a priority list which
assigns higher priorities to the tasks $T_{rpq}$ than the tasks $T_s$.
Priorities are assigned to tasks $T_{rpq}$ according to the lexicographical
order of their subscripts. (That is, $T_{111}$ has the highest priority and
$T_{112}$ has the next highest priority, etc.) The priority assignment of
the tasks $T_s$ is $(T_1, T_2, \ldots, T_S)$. We obtain the schedule in
Figure 3-2a where

$$\omega = b_1 b_k (n_1 - 1) + b_2 b_k n_2 + \cdots + b_k b_k n_k
+ b_1 \sum_{i=1}^{k} n_i b_i
= (b_k + b_1)\sum_{i=1}^{k} n_i b_i - b_1 b_k$$

However, this set of tasks can be scheduled as shown in Figure 3-2b where

$$\omega' = b_k \sum_{i=1}^{k} n_i b_i \qquad □$$

Let $\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$ be a
multiprocessor system. The quantity $\sum_{i=1}^{k} n_i b_i$ can be
considered as a measure of the total computational capacity of the system.
Indeed, the sum $\sum_{i=1}^{k} n_i b_i$ is the maximal throughput of the
system. According to the result in Theorem 3.2, the worst case performance
of a priority-driven schedule depends mainly on the ratio of the speeds of
the fastest and the slowest processors in the system. For systems of large
maximal throughput, the bound in Theorem 3.2 approaches $b_1/b_k + 1$. This
implies that in systems where the speed of the fastest processors is
significantly larger than that of the slowest processors, the worst case
performance of a priority-driven schedule may become very poor in comparison
with that of an optimal schedule.
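
As a quick numerical illustration of Theorem 3.2, the sketch below evaluates the bound $1 + b_1/b_k - b_1/\sum n_i b_i$ for a given system. The function names are assumptions made here; the fastest and slowest speeds are taken as the maximum and minimum of the speed list, which agrees with the ordering $b_1 > \cdots > b_k$ assumed in the theorem.

```python
from fractions import Fraction


def capacity(counts, speeds):
    """Maximal throughput of the system: the sum of n_i * b_i."""
    return sum(Fraction(n_i) * Fraction(b_i) for n_i, b_i in zip(counts, speeds))


def theorem_3_2_bound(counts, speeds):
    """Worst-case ratio omega/omega' of an arbitrary priority-driven schedule."""
    b_fast = Fraction(max(speeds))
    b_slow = Fraction(min(speeds))
    return 1 + b_fast / b_slow - b_fast / capacity(counts, speeds)


# (1, 3; 2, 1): one processor of speed 2 and three standard processors.
print(theorem_3_2_bound((1, 3), (2, 1)))       # 13/5, same as b + n/(b+n) with n=3, b=2
# A larger system with the same speed ratio moves closer to b_1/b_k + 1 = 3.
print(theorem_3_2_bound((10, 300), (2, 1)))    # 479/160
```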

3.2  Performance  of  Some  Heuristic  Scheduling  Algorithms 

Theorems 3.1 and 3.2 establish a lower bound on the performance of
priority-driven scheduling algorithms. Between the two extremes of using
an optimally chosen priority list and of using a completely arbitrary
priority list, there is the possibility of using priority lists obtained
by simple heuristic procedures. Such a possibility offers a good
compromise between the performance of the resultant scheduling algorithm
and the computational cost of determining the priority list. In this
section, we shall consider the special case in which the precedence relation
$<$ is empty. We denote such a set of tasks by $(\mathcal{T}, \mu)$.

Let us consider the problem of scheduling a set of tasks $(\mathcal{T}, \mu)$
on the multiprocessor system $\mathcal{P} = (1, n; b, 1)$. Suppose that we
choose the $r$ longest tasks in the set $\mathcal{T}$ and schedule them
according to a priority list $L$. Let $\omega$ denote the corresponding
completion time when they are executed on $\mathcal{P}$. Let $L_r$ denote a
priority list obtained by appending to $L$ an assignment of priorities to the
remaining tasks. Let $\omega_r$ denote the corresponding completion time.
When $\omega_r > \omega$, we have,
Theorem 3.3

$$\frac{\omega_r}{\omega'} \le 1 + \frac{1}{M} - \frac{1}{(b+n)M}$$

where

$$M = \max\Bigl\{\min\Bigl\{\Bigl\lceil\frac{r+1}{n+[b]}\Bigr\rceil,\
\Bigl\lceil\frac{r+1}{n+[b]}\Bigr\rceil\frac{[b]}{b} - \frac{[b]-1}{b}\Bigr\},\
\frac{r+1}{n+b}\Bigr\}$$

and $\omega'$ is the completion time according to an arbitrary schedule. This
bound is the best possible when $b$ is an integer and $n+b$ divides $r$.

Proof. Let $T_1, T_2, \ldots, T_r$ denote the $r$ longest tasks and
$T_{r+1}$ denote the $(r+1)$st longest task in $\mathcal{T}$. Since
$\omega_r > \omega$, tasks $T_1, T_2, \ldots, T_r$ are completed before
$t = \omega_r$. Moreover, since each processor can have at most one idle
period immediately prior to $t = \omega_r$, the idle period of the fast
processor is equal to or less than $\mu(T_{r+1})$. The idle period of any
slow processor is equal to or less than $\frac{1}{b}\mu(T_{r+1})$ if the fast
processor is not idle at $t = \omega_r$ and is equal to or less than
$\mu(T_{r+1})$ if the fast processor is idle at $t = \omega_r$. Now,

$$\omega_r = \frac{1}{n+1}\Bigl[\frac{U_1}{b} + U_2 + I_1 + I_2\Bigr]
= \frac{1}{n+1}\Bigl[\frac{1}{b}(U_1+U_2) + \frac{b-1}{b}(U_2+I_2)
+ \frac{1}{b}\,I_2 + I_1\Bigr] \tag{3-11}$$

where $I_1$, $I_2$, $U_1$ and $U_2$ are as defined in the proof of
Theorem 3.1. We note that if $I_1 = 0$,

$$\frac{I_2}{n} \le \frac{\mu(T_{r+1})}{b}$$

and if $I_1 \ne 0$,

$$I_1 \le \mu(T_{r+1})$$

and

$$\frac{I_2}{n-1} \le \mu(T_{r+1})$$

Consequently,

$$\frac{1}{b}\,I_2 + I_1 \le
\max\Bigl\{\frac{n}{b^2},\ \frac{n-1}{b} + 1\Bigr\}\,\mu(T_{r+1})
\le \frac{n+b-1}{b}\,\mu(T_{r+1})$$

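Thus eq. (3-11) becomes

$$\omega_r \le \frac{1}{n+1}\Bigl[\frac{n+b}{b}\,\omega'
+ \frac{b-1}{b}\,n\,\omega_r + \frac{n+b-1}{b}\,\mu(T_{r+1})\Bigr]$$

Notice that in any schedule, either $\bigl\lceil\frac{r+1}{n+[b]}\bigr\rceil$
of the $r+1$ tasks $T_1, T_2, \ldots, T_{r+1}$ are executed on a slow
processor or $\bigl\lceil\frac{r+1}{n+[b]}\bigr\rceil [b] - ([b]-1)$ of them
are executed on the fast processor. Consequently, we have

$$\Bigl\lceil\frac{r+1}{n+[b]}\Bigr\rceil\,\mu(T_{r+1}) \le \omega' \tag{3-12}$$

or

$$\Bigl(\Bigl\lceil\frac{r+1}{n+[b]}\Bigr\rceil\frac{[b]}{b}
- \frac{[b]-1}{b}\Bigr)\mu(T_{r+1}) \le \omega' \tag{3-13}$$

We also note that

$$(r+1)\,\mu(T_{r+1}) \le (n+b)\,\omega'$$

or

$$\frac{r+1}{n+b}\,\mu(T_{r+1}) \le \omega' \tag{3-14}$$

Combining (3-12)-(3-14), we write

$$M\,\mu(T_{r+1}) \le \omega'$$

where

$$M = \max\Bigl\{\min\Bigl\{\Bigl\lceil\frac{r+1}{n+[b]}\Bigr\rceil,\
\Bigl\lceil\frac{r+1}{n+[b]}\Bigr\rceil\frac{[b]}{b} - \frac{[b]-1}{b}\Bigr\},\
\frac{r+1}{n+b}\Bigr\}$$

Equation (3-11) becomes

$$\omega_r \le \frac{1}{n+1}\Bigl[\frac{n+b}{b}\,\omega'
+ \frac{b-1}{b}\,n\,\omega_r + \frac{n+b-1}{bM}\,\omega'\Bigr]$$

which simplifies to

$$\frac{\omega_r}{\omega'} \le 1 + \frac{1}{M} - \frac{1}{(n+b)M}$$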


To show that the bound is best possible, let us note that for $b$ and
$\frac{r}{n+b}$ being integers, the bound becomes

$$\frac{\omega_r}{\omega'} \le \frac{rb + n + b(n+b)}{rb + n + b} \tag{3-15}$$

Let $\mathcal{T} = \{T_1, T_2, \ldots, T_{r+n(n+b)+1}\}$ and

$$\mu(T_i) = (n+b)\,b, \qquad i = 1, 2, \ldots, r, r+1$$

$$\mu(T_i) = 1, \qquad i = r+2, \ldots, r+n(n+b)$$

$$\mu(T_i) = 1 + \epsilon, \qquad i = r+n(n+b)+1$$

where $\epsilon$ is a small positive number. The tasks $T_1, T_2, \ldots, T_r$
can be scheduled by the priority list $(T_1, T_2, \ldots, T_r)$. If we use
the priority list

$$(T_1, T_2, \ldots, T_r, T_{r+1}, T_{r+2}, \ldots, T_{r+n+1}, T_{r+n+2},
\ldots, T_{r+n(n+b)+1})$$

we obtain the schedule in Figure 3-3a with completion time

$$\omega' = rb + n + b$$

If we use the priority list

$$(T_1, T_2, \ldots, T_r, T_{r+2}, T_{r+3}, \ldots, T_{r+n(n+b)},
T_{r+n(n+b)+1}, T_{r+1})$$

the completion time is

$$\omega_r = rb + n + b(n+b)$$

as shown in Figure 3-3b. □

When $b$ and $\frac{r}{n+b}$ are integers and when the maximal throughput of
the system, $n+b$, is large, the upper bound of $\omega_r/\omega'$ given by
eq. (3-15) becomes approximately

$$1 + \frac{b}{1 + \frac{rb}{n+b}}$$

Figure 3-3
Hence, the ratio of $\omega_r$ to the minimum completion time of the tasks in
$\mathcal{T}$ is upper bounded by

$$1 + \frac{b}{1 + \frac{rb}{n+b}}$$

when the minimum completion time of the $r$ longest tasks is less than
$\omega_r$.

We now generalize Theorem 3.3 to the problem of scheduling a set
of tasks $(\mathcal{T}, \mu)$ on a multiprocessor system
$\mathcal{P} = (n_1, n_2, \ldots, n_k; b_1, b_2, \ldots, b_k)$. Again, let us
choose the $r$ longest tasks in $\mathcal{T}$. Let $\omega$ denote the
completion time of these tasks when they are executed on $\mathcal{P}$
according to a priority list $L$. Let $L_r$ denote the priority list obtained
by appending to $L$ an arbitrary assignment of priorities to the remaining
tasks in $\mathcal{T}$. Let $\omega_r$ denote the corresponding completion
time, and let $\omega'$ denote the completion time according to an arbitrary
schedule. When $\omega_r > \omega$ and $b_k = 1$, we have,

Theorem 3.4

$$\frac{\omega_r}{\omega'} \le 1 + \frac{1}{Q}
- \frac{1}{Q\sum_{i=1}^{k} n_i b_i}$$

where

$$Q = \max\Bigl\{\min_{1 \le j \le k}\Bigl\{
\Bigl\lceil\frac{r+1}{\sum_{i=1}^{k} n_i [b_i]}\Bigr\rceil\frac{[b_j]}{b_j}
- \frac{[b_j]-1}{b_j}\Bigr\},\
\frac{r+1}{\sum_{i=1}^{k} n_i b_i}\Bigr\}$$

Moreover, the bound is best possible when the $b_i$ are integers and
$\sum_{i=1}^{k} n_i b_i$ divides $r$.

Proof. The proof is similar to that of Theorem 3.3. Let $T_1, T_2, \ldots,
T_r$ denote the $r$ longest tasks and $T_{r+1}$ denote the $(r+1)$st longest
task in $\mathcal{T}$. It is clear that

$$\omega_r = \frac{1}{\sum_{i=1}^{k} n_i}
\Bigl(\frac{U_1}{b_1} + \frac{U_2}{b_2} + \cdots + \frac{U_k}{b_k}
+ I_1 + I_2 + \cdots + I_k\Bigr)$$

where $U_1, U_2, \ldots, U_k$ and $I_1, I_2, \ldots, I_k$ are as defined in
the proof of Theorem 3.2. This expression of $\omega_r$ can be rewritten as

$$\omega_r = \frac{1}{\sum_{i=1}^{k} n_i}
\Bigl[\frac{1}{b_1}\sum_{i=1}^{k} U_i
+ \sum_{i=1}^{k} \frac{b_1-b_i}{b_1}\Bigl(\frac{U_i}{b_i} + I_i\Bigr)
+ \sum_{i=1}^{k} \frac{b_i}{b_1}\,I_i\Bigr] \tag{3-16}$$

Moreover

$$\frac{1}{b_1}\sum_{i=1}^{k} U_i \le
\frac{\sum_{i=1}^{k} n_i b_i}{b_1}\,\omega' \tag{3-17}$$

and

$$\frac{U_i}{b_i} + I_i = n_i\,\omega_r \tag{3-18}$$

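To bound the third term in the right hand side of eq. (3-16), we note that
if there is no idle period in a processor of speed $b_\ell$ then

$$I_i \le n_i\,\frac{\mu(T_{r+1})}{b_\ell}, \qquad i \ne \ell$$

and

$$I_\ell \le (n_\ell - 1)\,\frac{\mu(T_{r+1})}{b_\ell}$$

Hence

$$\sum_{i=1}^{k} \frac{b_i}{b_1}\,I_i \le \max_{1 \le \ell \le k}
\Bigl\{\sum_{i \ne \ell} \frac{b_i}{b_1}\,\frac{n_i\,\mu(T_{r+1})}{b_\ell}
+ \frac{b_\ell}{b_1}\,\frac{(n_\ell - 1)\,\mu(T_{r+1})}{b_\ell}\Bigr\}$$

$$= \max_{1 \le \ell \le k}
\Bigl\{\frac{\sum_{i=1}^{k} n_i b_i - b_\ell}{b_1 b_\ell}\Bigr\}\mu(T_{r+1})
\le \frac{\sum_{i=1}^{k} n_i b_i - 1}{b_1}\,\mu(T_{r+1})$$

Thus eq. (3-16) becomes

$$\omega_r \le \frac{1}{\sum_{i=1}^{k} n_i}
\Bigl[\frac{\sum_{i=1}^{k} n_i b_i}{b_1}\,\omega'
+ \sum_{i=1}^{k} \frac{b_1-b_i}{b_1}\,n_i\,\omega_r
+ \frac{\sum_{i=1}^{k} n_i b_i - 1}{b_1}\,\mu(T_{r+1})\Bigr] \tag{3-19}$$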


We  note  that  in  any  schedule,  there  must  be 


r+1 


Z  njb.l 
l  i 


i=l 


b^-Cfb^-l) 


of  the  r+1  tasks  T.,,  T  ,  . .,,  T  ,  being  executed  on  a  processor  of  speed  b 
for  some  I,    0  <  £  <  k.   Consequently,  we  have 


w'  >  ^Tr+1} 


r+1 


Z  n.[b." 
i=l  x  X 


r^i      I^L-1 


(3-20) 


for  some  g,  =  1,   2,    ...,   k.      Since 


k 

(r+1)   |i(T    ,  ,  )   <   (   Z     n.b.  )   co' 
r+1     —     .    ,      11 
i=l 
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and  eq.  (3-20),  we  write 


QuOr,-,)  < 


r+1'  - 


where 


Q,  =  max 


mm 


Equation  (3-19)  becomes 


•• 

r+1 

k 

Z 

i=l 

n.[b.] 

1        X 

'1A     !hl±)  _-'- 

b.  b4       |'     k 

Z     n.b. 
1=1     X  X 


co     <     — 
r  k 


Z     n. 
i=l     x 


k 


,      ,  ,  Z     n.   b.    -  i 

,  k     bn    -  b.  .    t      l     l 

s     n.    b.      HI  +     E     _2^_i  n.    u     +  izi «L 

i=l     X      X     bl        1=1       bl  X      r  \  Q 


which  simplifies  to 


Z  n.  b.  oj  < 
.  -,  l  i  r  — 
i=l 


■  k 

z 


,    n.    b.    + 
i=l     i     l 


k 

Z     n.   b. 

i=l     X     X 

Q 


.  ±1 

QJ 


Q  '     ' 


That  is 


"Wi  1 


Q  Z  n.  b. 
.-,11 
i=l 


For  integer  values  of  b.  and  Z  n.  b.  divides  r,  the  bound  becomes 

l  .    ,      l     l 

1=1 


k 

/.,       ,  n     Z     n.b.    -  b 
co          rbn  +  (b^+l)    .,ii          1 
r  <        11         i=l 

co'   -  k 

rb_    +     Z     n.b. 
1       i=1     xi 


(3-21) 


and is best possible. That the bound in eq. (3-21) is best possible can be
demonstrated by an example similar to the one shown in Figure 3-3.

Another simple heuristic for assigning priorities to the tasks in
$(\mathcal{T}, \mu)$ is to assign higher priorities to longer tasks. Let $\omega$ denote the
completion time when a set of tasks $(\mathcal{T}, \mu)$ is executed on the multiprocessor
system $\mathcal{P} = (1, n; b, 1)$ using a priority assignment according to the lengths
of the tasks. Let $\omega'$ be the completion time of an arbitrary schedule. We
have
Theorem 3.5

\[
\frac{\omega}{\omega'} \le \frac{2(n+b)}{b+2}, \qquad b \le 2
\]

and

\[
\frac{\omega}{\omega'} \le \frac{n+b}{2}, \qquad b > 2
\]

Proof. Consider the priority-driven schedule corresponding to the priority
assignment according to the lengths of the tasks. Let $U_1$ and $U_2$ be the sums
of the execution times of the tasks run on the fast processor and the standard
processors, respectively. Let $I_1$ and $I_2$ denote the sums of the lengths of the
idle periods of the fast processor and the standard processors, respectively.
The completion time is

\[
\omega = \frac{1}{n+1}\left[\frac{U_1}{b} + U_2 + I_1 + I_2\right] \tag{3-22}
\]

We consider two cases:†

Case 1. $I_1 = 0$, that is, the fast processor is never idle. In
this case,

\[
\omega = \frac{1}{n+1}\left[\frac{U_1}{b} + U_2 + I_2\right]
       = \frac{1}{n+1}\left[\frac{U_1 + U_2}{b} + \frac{b-1}{b}\,(U_2 + I_2) + \frac{I_2}{b}\right] \tag{3-23}
\]

where

\[
\frac{1}{n+b}\,(U_1 + U_2) \le \omega' \tag{3-24}
\]

and

\[
U_2 + I_2 = n\,\omega \tag{3-25}
\]

† Note that the case $I_1 = I_2 = 0$ need not be of concern to us since
when $I_1 = I_2 = 0$, we have $\omega = \omega'$.

Let us note that if the fast processor executes only one job according to
this schedule, then $\omega \le \omega'$. Therefore, we can assume that the fast processor
executes two or more jobs. Let $T_r$ be the last job executed in the fast
processor and $\varphi_s$ be the longest idle period among the idle periods in the slow
processors as illustrated in Figure 3-4. Since

\[
\mu(T_r) \ge b\,\mu(\varphi_s)
\]

and

\[
\mu(T_r) \le \omega - \mu(\varphi_s)
\]

we have

\[
\omega \ge \mu(T_r) + \mu(\varphi_s) \ge (b+1)\,\mu(\varphi_s)
\]

Moreover,

\[
n\,\mu(\varphi_s) \ge I_2
\]

Hence

\[
n\,\omega \ge (b+1)\, I_2
\]

Substituting this expression and eqs. (3-24) and (3-25) into eq. (3-23), we
obtain

\[
\omega \le \frac{1}{n+1}\left[\frac{n+b}{b}\,\omega' + \frac{b-1}{b}\, n\,\omega + \frac{n\,\omega}{b(b+1)}\right]
\]

which simplifies to

\[
\frac{\omega}{\omega'} \le \frac{(b+1)(n+b)}{b(n+b+1)} \tag{3-26}
\]

[Figure 3-4. (a) Case 1: $T_r$ is executed on the fast processor. (b) Case 2: $T_r$ is executed on a slow processor.]

Case 2. $I_1 \ne 0$, that is, at least one of the slow processors is never idle.
Suppose that a processor that is never idle is $P_j$. We consider the two
possibilities, $b > 2$ and $b \le 2$. For $b \le 2$, $I_1 > \frac{\omega}{2}$ implies that at most one
task is executed on processor $P_j$. Consequently, the first task executed in
the fast processor is not the longest task, which is a contradiction to the
assumption that higher priorities are assigned to longer tasks. Therefore,
for $b \le 2$, $I_1 \le \frac{\omega}{2}$. Thus, eq. (3-22) becomes

\[
\omega = \frac{1}{n+1}\left[\frac{U_1 + U_2}{b} + \frac{b-1}{b}\,(U_2 + I_2) + \frac{I_2}{b} + I_1\right]
       \le \frac{1}{n+1}\left[\frac{n+b}{b}\,\omega' + \frac{b-1}{b}\, n\,\omega + \frac{n-1}{b}\,\omega + \frac{\omega}{2}\right]
\]

That is

\[
\frac{\omega}{\omega'} \le \frac{2(n+b)}{b+2}
\]

Similarly, we claim that $I_1 \le \frac{b-1}{b}\,\omega$ for $b > 2$. (If $I_1 > \frac{b-1}{b}\,\omega$, each
slow processor executes at most one job. Let $T_i$ be the task whose
execution time $\mu(T_i)$ is equal to $\omega$. Let $T_r$ be the longest task in $\mathcal{T}$. Since
$T_r$ is executed in the fast processor and

\[
\frac{\mu(T_r)}{b} < \frac{1}{b}\,\omega
\]

we have

\[
\mu(T_r) < \mu(T_i)
\]

which is impossible.) Therefore,

\[
\omega \le \frac{1}{n+1}\left[\frac{n+b}{b}\,\omega' + \frac{b-1}{b}\, n\,\omega + \frac{n-1}{b}\,\omega + \frac{b-1}{b}\,\omega\right]
\]

or

\[
\frac{\omega}{\omega'} \le \frac{n+b}{2}
\]

Combining case 1 and case 2, we have for $b \le 2$

\[
\frac{\omega}{\omega'} \le \frac{2(n+b)}{b+2}
\]

since for $n \ge 1$ and $b \le 2$,

\[
\frac{(b+1)(n+b)}{b(n+b+1)} \le \frac{2(n+b)}{b+2}
\]

Similarly, we have for $b > 2$

\[
\frac{\omega}{\omega'} \le \frac{n+b}{2}
\]

since for $n \ge 1$ and $b > 2$

\[
\frac{(b+1)(n+b)}{b(n+b+1)} \le \frac{n+b}{2} \qquad \Box
\]
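The two bounds of Theorem 3.5 and the Case 1 bound of eq. (3-26) are easy to evaluate numerically. The following is a minimal sketch, not part of the original report; the function names and the grid of test values are illustrative assumptions, and speeds are assumed to satisfy b >= 1.

```python
from fractions import Fraction


def longest_first_bound(n, b):
    """Bound on omega/omega' from Theorem 3.5 for the system (1, n; b, 1)."""
    b = Fraction(b)
    if b <= 2:
        return 2 * (n + b) / (b + 2)
    return (n + b) / 2


def case1_bound(n, b):
    """Right-hand side of eq. (3-26), the bound obtained in Case 1."""
    b = Fraction(b)
    return (b + 1) * (n + b) / (b * (n + b + 1))


# Hypothetical grid of values: the Case 1 bound never exceeds the
# bound stated in the theorem, as the end of the proof asserts.
for n in range(1, 6):
    for b in (1, Fraction(3, 2), 2, 3, 8):
        assert case1_bound(n, b) <= longest_first_bound(n, b)
```

Exact rational arithmetic is used so that boundary cases such as b = 1, n = 1, where the two expressions coincide, compare cleanly.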


3.3  Scheduling  on  Different  Systems 

We now extend our discussion to the case of executing a set of
tasks on different multiprocessor systems. Let $(\mathcal{T}, \mu, <)$ be a given set
of tasks and $\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$ and
$\mathcal{P}' = (n'_1, n'_2, \ldots, n'_{k'};\ b'_1, b'_2, \ldots, b'_{k'})$ be two multiprocessor systems. Let $\omega$ be the completion
time when $(\mathcal{T}, \mu, <)$ is executed on $\mathcal{P}$ according to a priority-driven
schedule. Let $\omega'$ be the completion time when $(\mathcal{T}, \mu, <)$ is executed on
$\mathcal{P}'$ according to an arbitrary schedule. Extending the result in Theorem 3.2,
we obtain
Theorem 3.6

\[
\frac{\omega}{\omega'} \le \frac{b'_1}{b_k} + \frac{\sum_{i=1}^{k'} n'_i b'_i}{\sum_{i=1}^{k} n_i b_i} - \frac{b'_1}{\sum_{i=1}^{k} n_i b_i}
\]

Moreover, the bound is the best possible.

Proof. The proof of this theorem is similar to that of Theorem 3.2 and
will only be outlined here. As was discussed in the proof of Theorem 3.2,
the completion time of the priority-driven schedule is given by eq. (3-6).
Again, we have

\[
\frac{U_i}{b_i} + I_i = n_i\,\omega
\]

and

\[
\frac{\sum_{i=1}^{k} U_i}{\sum_{i=1}^{k'} n'_i b'_i} \le \omega' \tag{3-27}
\]

Similar to eqs. (3-9) and (3-10), the sum of the lengths of the idle
periods of any processor must be such that

\[
b_k\, I_i \le b'_1\,\omega'\, n_i \tag{3-28}
\]

and

\[
b_k \sum_{i=1}^{k} I_i \le b'_1\,\omega' \left(\sum_{i=1}^{k} n_i - 1\right) \tag{3-29}
\]

Thus, eq. (3-6) becomes

\[
\omega \le \frac{1}{\sum_{i=1}^{k} n_i}\left[\frac{\sum_{i=1}^{k'} n'_i b'_i}{b_1}\,\omega' + \sum_{i=1}^{k}\frac{b_1 - b_i}{b_1}\, n_i\,\omega
 + \frac{b_k}{b_1}\,\frac{b'_1}{b_k}\left(\sum_{i=1}^{k} n_i - 1\right)\omega' + \sum_{i=1}^{k}\frac{b_i - b_k}{b_1}\,\frac{b'_1}{b_k}\, n_i\,\omega'\right]
\]

That is

\[
b_k \sum_{i=1}^{k} n_i b_i\,\omega \le \left(b_k \sum_{i=1}^{k'} n'_i b'_i + b'_1 \sum_{i=1}^{k} n_i b_i - b_k b'_1\right)\omega'
\]

or

\[
\frac{\omega}{\omega'} \le \frac{b'_1}{b_k} + \frac{\sum_{i=1}^{k'} n'_i b'_i}{\sum_{i=1}^{k} n_i b_i} - \frac{b'_1}{\sum_{i=1}^{k} n_i b_i}
\]

The example in Figure 3-5 demonstrates that this bound is best possible. □

Let us note that if $\sum_{i=1}^{k'} n'_i b'_i = \sum_{i=1}^{k} n_i b_i$, the bound in Theorem 3.6
becomes

\[
\frac{\omega}{\omega'} \le \frac{b'_1}{b_k} + 1 - \frac{b'_1}{\sum_{i=1}^{k} n_i b_i} \tag{3-30}
\]

That is, for two multiprocessor systems with the same maximal throughput,
the bound is mainly a function of $b'_1$ and $b_k$. Thus, if we hold the values
of $b'_1$ and $b_k$ fixed, we can trade a smaller number of fast processors for
a larger number of slow processors, or conversely, without changing the bound
on the worst case performance of priority-driven scheduling algorithms.
For example, suppose $b'_1$ is approximately equal to $b_k$ and $\sum_{i=1}^{k} n_i b_i$ is
very large; then the ratio $\omega/\omega'$ is approximately upper bounded by 2. In this
case, $\mathcal{P}$ is a system containing a small number of fast processors and $\mathcal{P}'$
is a system containing a large number of slow processors. The result in (3-30)
says that any arbitrary priority-driven schedule for $\mathcal{P}$ is never worse than
the best possible schedule for $\mathcal{P}'$ by more than a factor of 2. This indeed is a
quantitative confirmation of the fact that a processor of speed b is more
desirable than b slow processors of speed 1.

There is another interesting interpretation of the result in
Theorem 3.6. Suppose that we are to execute a set of tasks on the system
$\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$. Instead of using all the
processors in the system, one might choose to use only the $n_1$ fastest
processors. Let $\omega_1$ denote the completion time when a set of tasks
$(\mathcal{T}, \mu, <)$ is executed on the $n_1$ fastest processors according to a
priority-driven schedule and let $\omega'$ be the completion time when the set
$(\mathcal{T}, \mu, <)$ is executed on $\mathcal{P}$ according to an arbitrary schedule. According
to Theorem 3.6, we have

\[
\frac{\omega_1}{\omega'} \le \frac{1}{n_1 b_1}\sum_{i=1}^{k} n_i b_i + 1 - \frac{1}{n_1}
\]

To compare the upper bound of $\omega_1/\omega'$ with the upper bound of $\omega/\omega'$ given in
Theorem 3.2, we plot in Figure 3-6 the values of $b_1$ for which the upper
bound of $\omega/\omega'$ becomes larger than that of $\omega_1/\omega'$ for fixed values of $n_1$ and
$\sum_{i=1}^{k} n_i b_i$. Without loss of generality, assume that $b_k = 1$. When $b_1$ is
larger than that shown in Figure 3-6, one is assured of an improvement on
the worst case performance of a priority-driven scheduling algorithm by
using only the fastest processors in the system.
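The comparison described above can be tried numerically. The sketch below evaluates the bound of Theorem 3.6, its special case for a single system (the bound quoted from Theorem 3.2), and the bound for using only the fastest class of processors. The function names and the list-based representation of a system are assumptions made for illustration only.

```python
from fractions import Fraction


def throughput(n, b):
    """Maximal throughput: sum of n_i * b_i over the processor classes."""
    return sum(Fraction(ni) * Fraction(bi) for ni, bi in zip(n, b))


def bound_two_systems(n, b, n_prime, b_prime):
    """Theorem 3.6: priority-driven on (n; b) versus arbitrary on (n'; b')."""
    total = throughput(n, b)
    slowest = Fraction(min(b))               # b_k of the first system
    fastest_prime = Fraction(max(b_prime))   # b'_1 of the second system
    return (fastest_prime / slowest
            + throughput(n_prime, b_prime) / total
            - fastest_prime / total)


def bound_same_system(n, b):
    """Special case of the bound when both systems are identical."""
    return bound_two_systems(n, b, n, b)


def bound_fastest_only(n, b):
    """Priority-driven on the fastest class only, versus arbitrary on all."""
    i = b.index(max(b))
    return bound_two_systems([n[i]], [b[i]], n, b)


# Hypothetical system: one processor of speed 4 and four of speed 1.
print(bound_same_system([1, 4], [4, 1]))   # bound using every processor
print(bound_fastest_only([1, 4], [4, 1]))  # bound using the fast one only
```

For this example system the worst case bound is smaller when only the fast processor is used, which is the kind of situation Figure 3-6 is said to chart.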

Another interesting problem is to compare the execution of a set
of tasks on two different multiprocessor systems using a priority-driven
schedule specified by the same priority list. This is a realistic
situation when some of the processors in an existing system are replaced
by processors of different speed yet the priorities assigned to the tasks
to be executed are not changed. To illustrate the point, we investigate a
special case in which a set of tasks $(\mathcal{T}, \mu)$ is to be executed on two
multiprocessor systems $\mathcal{P} = (n+1;\ 1)$ and $\mathcal{P}' = (1, n;\ b, 1)$ according to
the same priority list $L$. Let $\omega$ and $\omega'$ denote the respective completion
times.
Theorem 3.7

The ratio $\omega/\omega'$ is bounded by

\[
\frac{\omega}{\omega'} \le
\begin{cases}
\dfrac{2(n+b)}{n+2}, & b \le 2 \\[2ex]
b, & b > 2
\end{cases}
\]


Proof. Suppose that there are n+1 or fewer tasks in $\mathcal{T}$. Clearly,

\[
\omega = \mu(T_1)
\]

and

\[
\omega' \ge \frac{1}{b}\,\mu(T_1)
\]

where $T_1$ is a task with the longest execution time. It follows that

\[
\frac{\omega}{\omega'} \le b
\]

Suppose that there are more than n+1 tasks in $\mathcal{T}$. Let $U$ denote the
sum of the execution times of the tasks in $\mathcal{T}$ and let $I$ denote the sum of
the lengths of idle periods of all processors when $(\mathcal{T}, \mu)$ is executed on
the system $\mathcal{P} = (n+1;\ 1)$. Clearly,

\[
\omega = \frac{1}{n+1}\,(U + I) \tag{3-31}
\]

Let $T_r$ be the last task on a processor which is never idle in the time
interval $(0, \omega)$. The length of the idle period of any processor must be
equal to or less than $\mu(T_r)$. Hence

\[
I \le n\,\mu(T_r)
\]

But

\[
\mu(T_r) \le \frac{U}{n+2}
\]

Combining these two inequalities with eq. (3-31), we have

\[
\omega \le \frac{1}{n+1}\left(U + \frac{nU}{n+2}\right) = \frac{2U}{n+2}
\]

Since

\[
\omega' \ge \frac{U}{n+b}
\]

the ratio $\omega/\omega'$ is bounded by

\[
\frac{\omega}{\omega'} \le \frac{2(n+b)}{n+2}
\]

Since

\[
\frac{2(n+b)}{n+2} \le b
\]

for $b \ge 2$, the theorem follows immediately. □

Corollary 1. For the case of b = 2 and n = 2, we have

\[
1 \le \frac{\omega}{\omega'} \le 2
\]

for any priority list $L$ and the bound is best possible.

4.  PERFORMANCE OF PREEMPTIVE SCHEDULING ALGORITHMS

In this section, we study algorithms which produce preemptive
schedules with minimum completion time. By a preemptive schedule, we mean
one in which the execution of a task may be interrupted before its
completion. We assume throughout our discussion that the cost associated with
task preemption is negligible. This assumption is justified in
systems where task preemption does not require moving tasks into and out
of main memory.

4.1  Performance Improvement Obtainable by Preemptive Scheduling

The bound on improvement in completion time of a set of tasks
executed on a system $\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$ according to
a preemptive schedule over that using an arbitrary priority-driven schedule
can be obtained directly from Theorem 3.2. Indeed, this theorem can be
restated as
Theorem 4.1

Let $\omega$ be the completion time when a set of tasks $(\mathcal{T}, \mu, <)$ is
executed on the system $\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$ according
to a priority-driven schedule and $\omega_p$ be the completion time when the set
is executed on $\mathcal{P}$ according to an optimal preemptive schedule. Then

\[
\frac{\omega}{\omega_p} \le \frac{b_1}{b_k} + 1 - \frac{b_1}{\sum_{i=1}^{k} n_i b_i}
\]

where $b_1 > b_2 > \cdots > b_k$. Moreover, the bound is best possible.

As noted in Section 3, for systems of large maximal throughput
($\sum_{i=1}^{k} n_i b_i$), this bound approaches $\frac{b_1}{b_k} + 1$. Hence, when the speed of the
fastest processor in the system is very large compared to that of the
slowest processor, a significant reduction in the completion time of
the set $(\mathcal{T}, \mu, <)$ can be achieved by allowing task preemption.

Let us note again that in a priority-driven schedule, processors
are never idle when there are tasks ready to be executed. We expect the
improvement in completion time attained by preemption to be less for
non-preemptive schedules in which processors are allowed to idle when there
are executable tasks.

4.2  Performance of Preemptive Schedule for Independent Tasks

Let us for the moment consider the special case in which the
precedence relation < over the set of tasks is empty. That is, the tasks
in $\mathcal{T} = \{T_1, T_2, \ldots, T_m\}$ are independent. Suppose that
$\mu(T_1) \ge \mu(T_2) \ge \mu(T_3) \ge \cdots \ge \mu(T_m)$. Let $\omega_p$ denote the completion
time of the set $(\mathcal{T}, \mu)$ when it is executed on the system $\mathcal{P} = (1, n;\ b, 1)$
according to an optimal preemptive schedule. We have

Lemma 4.1

\[
\omega_p \ge \max\left\{ \max_{1 \le j \le n+1}\left[\sum_{i=1}^{j} \frac{\mu(T_i)}{b+j-1}\right],\ \sum_{i=1}^{m} \frac{\mu(T_i)}{n+b} \right\}
\]

Proof. It is clear that

\[
\omega_p \ge \sum_{i=1}^{m} \frac{\mu(T_i)}{n+b}
\]

since no schedule can have a completion time shorter than the one which keeps
all of the processors busy. To show that

\[
\omega_p \ge \max_{1 \le j \le n+1}\left[\sum_{i=1}^{j} \frac{\mu(T_i)}{b+j-1}\right]
\]

let us look at the individual terms inside the brackets. They are

\[
\frac{\mu(T_1)}{b},\quad \frac{\mu(T_1)+\mu(T_2)}{b+1},\quad \frac{\mu(T_1)+\mu(T_2)+\mu(T_3)}{b+2},\quad \ldots,\quad \frac{\mu(T_1)+\mu(T_2)+\cdots+\mu(T_{n+1})}{b+n}
\]

$\frac{\mu(T_1)}{b}$ is equal to the time required to execute task $T_1$ on the fast
processor. $\omega_p$ is equal to or larger than this quantity since there is
no way to complete the execution of $T_1$ in shorter time. Similarly,
$\frac{1}{b+1}[\mu(T_1) + \mu(T_2)]$ is equal to the shortest possible time to complete
the execution of tasks $T_1$ and $T_2$, and $\frac{1}{b+j-1}[\mu(T_1) + \mu(T_2) + \cdots + \mu(T_j)]$
is equal to the shortest possible time to complete the execution of tasks
$T_1, T_2, \ldots, T_j$. Clearly, $\omega_p$ is lower bounded by the maximum of these
quantities since no schedule can have a completion time shorter than the
time required to execute the j longest tasks on the j fastest processors. □

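Indeed, the lower bound on the optimal completion time in Lemma 4.1
can be achieved. In other words, we have
Theorem 4.2

\[
\omega_p = \max\left\{ \max_{1 \le j \le n+1}\left[\sum_{i=1}^{j} \frac{\mu(T_i)}{b+j-1}\right],\ \sum_{i=1}^{m} \frac{\mu(T_i)}{n+b} \right\} \tag{4-1}
\]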
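Equation (4-1) is straightforward to evaluate directly. The sketch below is an illustration only; the function name and the example task sets are assumptions, and exact fractions are used so that the maximum is not blurred by rounding.

```python
from fractions import Fraction


def optimal_preemptive_time(times, n, b):
    """Evaluate eq. (4-1) for the system (1, n; b, 1).

    times: execution times mu(T_i) of independent tasks
    n:     number of standard processors (speed 1)
    b:     speed of the single fast processor
    """
    b = Fraction(b)
    longest_first = sorted((Fraction(t) for t in times), reverse=True)

    # Second term of eq. (4-1): total work spread over all processors.
    best = sum(longest_first) / (n + b)

    # First term: the j longest tasks on the fast processor plus j-1
    # standard processors, for j = 1, ..., n+1.
    prefix = Fraction(0)
    for j, t in enumerate(longest_first[: n + 1], start=1):
        prefix += t
        best = max(best, prefix / (b + j - 1))
    return best


# Balanced task set: the total-work term decides.
print(optimal_preemptive_time([3, 3, 2, 2, 2], n=2, b=2))
# One dominant task: the term mu(T_1)/b decides.
print(optimal_preemptive_time([10, 1, 1], n=2, b=2))
```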


Proof. We shall prove the theorem by finding a schedule† for the tasks
$(\mathcal{T}, \mu)$ on the system $\mathcal{P} = (1, n;\ b, 1)$ whose completion time is equal to
$\omega_p$ given by eq. (4-1). We consider the three cases:

† We note that in any valid schedule, a task cannot be assigned to two
different processors simultaneously at any time.


Case 1.

\[
\sum_{i=1}^{m} \frac{\mu(T_i)}{n+b} \ge \max_{1 \le j \le n+1} \sum_{i=1}^{j} \frac{\mu(T_i)}{b+j-1} \tag{4-2}
\]

In this case, we want to find a schedule whose completion time is equal to

\[
\omega_p = \sum_{i=1}^{m} \frac{\mu(T_i)}{n+b}
\]

To do so, let us suppose that $\mu(T_i) \le \omega_p$ for all $i = 1, 2, \ldots, m$. Let
$p_1$ be an integer which is such that

\[
\frac{1}{b}\sum_{i=1}^{p_1-1} \mu(T_i) \le \omega_p < \frac{1}{b}\sum_{i=1}^{p_1} \mu(T_i)
\]

We assign the tasks $T_1, T_2, \ldots, T_{p_1-1}$ to the processor $P_1$. Let
$\tau_1 = \frac{1}{b}\sum_{i=1}^{p_1-1} \mu(T_i)$. We assign the task $T_{p_1}$ to the processor $P_1$ in the time
interval $(\tau_1, \omega_p)$ and to the processor $P_2$ in the time interval $(0, \tau'_1)$
where

\[
\tau'_1 = \mu(T_{p_1}) - b\,(\omega_p - \tau_1)
\]

The tasks $T_{p_1+1}, T_{p_1+2}, \ldots, T_{p_1+p_2-1}$ are assigned to the processor $P_2$
where $p_2$ is such that

\[
\tau'_1 + \sum_{i=1}^{p_2-1} \mu(T_{p_1+i}) \le \omega_p < \tau'_1 + \sum_{i=1}^{p_2} \mu(T_{p_1+i})
\]

Let

\[
\tau_2 = \tau'_1 + \sum_{i=1}^{p_2-1} \mu(T_{p_1+i})
\]

and

\[
\tau'_2 = \mu(T_{p_1+p_2}) - (\omega_p - \tau_2)
\]

The task $T_{p_1+p_2}$ is assigned to the processor $P_2$ in the time interval
$(\tau_2, \omega_p)$ and to the processor $P_3$ in the time interval $(0, \tau'_2)$. Similarly,
we let the integer $p_\ell$ be such that

\[
\tau'_{\ell-1} + \sum_{i=1}^{p_\ell-1} \mu(T_{p_1+\cdots+p_{\ell-1}+i}) \le \omega_p < \tau'_{\ell-1} + \sum_{i=1}^{p_\ell} \mu(T_{p_1+\cdots+p_{\ell-1}+i})
\]

where

\[
\tau_\ell = \tau'_{\ell-1} + \sum_{i=1}^{p_\ell-1} \mu(T_{p_1+\cdots+p_{\ell-1}+i})
\]

and

\[
\tau'_\ell = \mu(T_{p_1+\cdots+p_\ell}) - (\omega_p - \tau_\ell)
\]

We assign the tasks $T_{p_1+\cdots+p_{\ell-1}+1}, T_{p_1+\cdots+p_{\ell-1}+2}, \ldots, T_{p_1+\cdots+p_\ell-1}$ to the
processor $P_\ell$ and the task $T_{p_1+\cdots+p_\ell}$ to the processor $P_\ell$ in the time
interval $(\tau_\ell, \omega_p)$ and to the processor $P_{\ell+1}$ in the time interval $(0, \tau'_\ell)$.
We proceed in this manner until the m tasks are all assigned. The
resultant schedule is as shown in Figure 4-1 where $T_m$ is completed at the
time $\omega_p$ on the processor $P_{n+1}$.
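The assignment procedure just described fills the processors one after another up to time $\omega_p$ and splits a task whenever it reaches the end of a processor. A minimal sketch of that filling rule follows. It is an illustration under the assumptions of this sub-case (condition (4-2) holds, every $\mu(T_i) \le \omega_p$, and $b \ge 1$); the function name and the tuple layout of the result are hypothetical.

```python
from fractions import Fraction


def wrap_around_schedule(times, n, b):
    """Fill processors P_1 (speed b), P_2, ..., P_{n+1} (speed 1) in turn.

    times must be given longest first. Returns (omega_p, pieces) where
    each piece is (processor index, task index, start, end), 0-based.
    """
    b = Fraction(b)
    speeds = [b] + [Fraction(1)] * n
    omega_p = sum(Fraction(t) for t in times) / (n + b)
    if any(t > omega_p for t in times):
        raise ValueError("this sketch assumes mu(T_i) <= omega_p for all i")

    pieces = []
    proc, clock = 0, Fraction(0)
    for task, work in enumerate(times):
        remaining = Fraction(work)
        while remaining > 0:
            if clock == omega_p:          # processor is full, use the next
                proc, clock = proc + 1, Fraction(0)
            run = min(remaining / speeds[proc], omega_p - clock)
            pieces.append((proc, task, clock, clock + run))
            remaining -= run * speeds[proc]
            clock += run
    return omega_p, pieces


omega_p, pieces = wrap_around_schedule([3, 3, 2, 2, 2], n=2, b=2)
for piece in pieces:
    print(piece)
```

In this example the fourth task is split: it runs at the end of the second processor and at the start of the third, and the two intervals do not overlap, which is the property the proof relies on.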

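Suppose that the execution times of some of the tasks are larger
than $\omega_p$. In particular, suppose that $\mu(T_1) \ge \mu(T_2) \ge \cdots \ge \mu(T_n) > \omega_p$.
We now show that

\[
\omega_p \le \sum_{i=n+1}^{m} \mu(T_i) \le b\,\omega_p
\]

Since

\[
\frac{\sum_{i=1}^{m} \mu(T_i)}{b+n} = \omega_p
\]

and

\[
\sum_{i=1}^{n} \mu(T_i) > n\,\omega_p
\]

we obtain

\[
\sum_{i=n+1}^{m} \mu(T_i) < b\,\omega_p
\]

That

\[
\omega_p \le \sum_{i=n+1}^{m} \mu(T_i)
\]

follows from eq. (4-2), that is,

\[
\frac{\sum_{i=1}^{n} \mu(T_i)}{b+n-1} \le \frac{\sum_{i=1}^{m} \mu(T_i)}{b+n} = \omega_p
\]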

If we let $T'_{n+1}$ be a new task such that $\mu(T'_{n+1}) = \sum_{i=n+1}^{m} \mu(T_i)$, we can
schedule the n+1 tasks $T_1, T_2, \ldots, T_n, T'_{n+1}$ as shown in Figure 4-2, where

\[
\Delta_i = \frac{\mu(T_i) - \omega_p}{b-1}, \qquad i = 1, 2, \ldots, n
\]

and

\[
\Delta_{n+1} = \frac{\mu(T'_{n+1}) - \omega_p}{b-1}
\]

Since

\[
b\,\Delta_i + \omega_p - \Delta_i = \mu(T_i), \qquad i = 1, 2, \ldots, n
\]

and

\[
b\,\Delta_{n+1} + \omega_p - \Delta_{n+1} = \mu(T'_{n+1})
\]

and

\[
\sum_{i=1}^{n+1} \Delta_i = \omega_p
\]

the schedule in Figure 4-2 is a valid one.

Suppose that there are only j tasks with execution time longer
than $\omega_p$, and j < n. That is, $\mu(T_1) \ge \mu(T_2) \ge \cdots \ge \mu(T_j) > \omega_p$ but
$\mu(T_{j+1}) \le \omega_p$. In this case, we let $T'_{j+1}$ be a new task such that
$\mu(T'_{j+1}) = \sum_{i=j+1}^{m} \mu(T_i)$. We can use the schedule shown in Figure 4-3,† where

\[
\Delta_i = \frac{\mu(T_i) - \omega_p}{b-1}, \qquad i = 1, 2, \ldots, j
\]

and

\[
\Delta_{j+1} = \omega_p - \sum_{i=1}^{j} \Delta_i
\]

† This can be done because

\[
\Delta_{j+1} = \omega_p - \frac{\sum_{i=1}^{j} \mu(T_i) - j\,\omega_p}{b-1}
             = \frac{(b+j-1)\,\omega_p - \sum_{i=1}^{j} \mu(T_i)}{b-1} \ge 0
\]

follows from $\omega_p \ge \dfrac{\sum_{i=1}^{j} \mu(T_i)}{b+j-1}$.

Since

\[
b\,\Delta_i + \omega_p - \Delta_i = \mu(T_i), \qquad i = 1, 2, \ldots, j
\]

and

\[
\begin{aligned}
b\,\Delta_{j+1} + \omega_p - \Delta_{j+1} + (n-j)\,\omega_p
 &= (b-1)\,\Delta_{j+1} + (n-j+1)\,\omega_p \\
 &= (b-1)\left[\omega_p - \frac{\sum_{i=1}^{j} \mu(T_i) - j\,\omega_p}{b-1}\right] + (n-j+1)\,\omega_p \\
 &= (b+n)\,\omega_p - \sum_{i=1}^{j} \mu(T_i) \\
 &= \sum_{i=j+1}^{m} \mu(T_i)
\end{aligned}
\]

the schedule in Figure 4-3 is a valid one.
Case 2.

Suppose that

\[
\omega_p = \frac{\mu(T_1)}{b}
\]

From eq. (4-1), we note that

\[
\frac{\mu(T_1) + \mu(T_2)}{b+1} \le \frac{\mu(T_1)}{b}
\]

It follows that

\[
\mu(T_2) \le \frac{\mu(T_1)}{b} \tag{4-3}
\]

Also,

\[
\begin{aligned}
\sum_{i=2}^{m} \frac{\mu(T_i)}{n}
 &= \frac{n+b}{n} \sum_{i=2}^{m} \frac{\mu(T_i)}{n+b} \\
 &= \frac{n+b}{n}\left[\sum_{i=1}^{m} \frac{\mu(T_i)}{n+b} - \frac{\mu(T_1)}{n+b}\right] \\
 &\le \frac{n+b}{n}\left[\frac{\mu(T_1)}{b} - \frac{\mu(T_1)}{n+b}\right] \\
 &= \frac{\mu(T_1)}{b}
\end{aligned} \tag{4-4}
\]

Therefore, the schedule shown in Figure 4-4 can be used. In this schedule,
$T_1$ is assigned to the fast processor $P_1$ and the remaining tasks are
assigned to $P_2, P_3, \ldots, P_{n+1}$. The relations in (4-3) and (4-4) guarantee
the validity of the schedule.

Case 3.

\[
\omega_p = \sum_{i=1}^{j} \frac{\mu(T_i)}{b+j-1}
\]

for some j, $1 < j \le n+1$. It is clear that

\[
\mu(T_i) \le b\,\omega_p, \qquad i = 1, 2, \ldots, j \tag{4-5}
\]

However, since

\[
\sum_{i=1}^{j-1} \frac{\mu(T_i)}{b+j-2} \le \omega_p
\]

which can be rewritten as

\[
\frac{b+j-1}{b+j-2}\left[\omega_p - \frac{\mu(T_j)}{b+j-1}\right] \le \omega_p
\]

we have

\[
\omega_p \le \mu(T_j) \tag{4-6}
\]

On the other hand it follows from

\[
\sum_{i=1}^{j+1} \frac{\mu(T_i)}{b+j} \le \omega_p
\]

(that is,

\[
\frac{b+j-1}{b+j}\left[\sum_{i=1}^{j} \frac{\mu(T_i)}{b+j-1}\right] + \frac{\mu(T_{j+1})}{b+j} \le \omega_p \ )
\]

that

\[
\mu(T_{j+1}) \le \omega_p \tag{4-7}
\]

Moreover, from

\[
\sum_{i=1}^{m} \frac{\mu(T_i)}{n+b} \le \omega_p
\]

we have

\[
\frac{b+j-1}{n+b} \sum_{i=1}^{j} \frac{\mu(T_i)}{b+j-1} + \frac{n-j+1}{n+b} \sum_{i=j+1}^{m} \frac{\mu(T_i)}{n-j+1} \le \omega_p
\]

or

\[
\frac{b+j-1}{n+b}\,\omega_p + \frac{n-j+1}{n+b} \sum_{i=j+1}^{m} \frac{\mu(T_i)}{n-j+1} \le \omega_p
\]

which simplifies to

\[
\sum_{i=j+1}^{m} \frac{\mu(T_i)}{n-j+1} \le \omega_p \tag{4-8}
\]

Therefore, the schedule shown in Figure 4-5 can be used. In this schedule,
the tasks $T_1, T_2, \ldots, T_j$ are assigned to the first j processors. The
execution time of the portion of $T_i$ assigned to $P_1$, $\Delta_i$, is given by

\[
\Delta_i = \frac{\mu(T_i) - \omega_p}{b-1}
\]

for $i = 1, 2, \ldots, j$. The remaining tasks are assigned to processors
$P_{j+1}, P_{j+2}, \ldots, P_{n+1}$. The relations in (4-5)-(4-8) and

\[
\sum_{i=1}^{j} \Delta_i = \omega_p
\]

guarantee the validity of the schedule.

The result in Theorem 4.2 can be extended immediately. Let $\omega_p$
denote the completion time of a set of tasks $(\mathcal{T}, \mu)$ executed on the system
$\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$ according to an optimal preemptive
schedule. Similar to Theorem 4.2, we have

\[
\omega_p = \max\left\{ \max_{1 \le j < n_1+n_2+\cdots+n_k}\left[\frac{\sum_{i=1}^{j} \mu(T_i)}{\sum_{i=1}^{\ell} n_i b_i + \left(j - \sum_{i=1}^{\ell} n_i\right) b_{\ell+1}}\right],\ \frac{\sum_{i=1}^{m} \mu(T_i)}{\sum_{i=1}^{k} n_i b_i} \right\}
\]

where $\ell$ is an integer such that

\[
\sum_{i=1}^{\ell} n_i < j \le \sum_{i=1}^{\ell+1} n_i
\]

5.   SCHEDULING ALGORITHMS TO MINIMIZE
MEAN FLOW TIME


In this section, we study the performance of scheduling algorithms
using mean flow time as the criterion for comparison. We shall consider
only the special case in which the precedence relation < over the set of
tasks is empty. Algorithms producing schedules with minimal mean flow
times will be described. Bounds on the minimum mean flow times for
different multiprocessor systems will be derived. These bounds will also
provide us with information concerning the relative merit of different
multiprocessor systems.

5.1  Optimal  Scheduling  Algorithms 

We present now an algorithm for constructing schedules with
minimum mean flow time when a set of independent tasks $(\mathcal{T}, \mu)$ is executed
on the system $\mathcal{P} = (1, n;\ b, 1)$. We note that the mean flow time of a
schedule, t, defined in Sec. 2.2 can be written as

\[
t = \sum_{i=1}^{m} w_i\,\mu(T_i) \tag{5-1}
\]

where $w_i = \frac{m_i}{b}$ if task $T_i$ is executed on the fast processor and is followed
by $m_i - 1$ tasks in that processor, and $w_i = m_i$ if task $T_i$ is executed on a
standard processor and is followed by $m_i - 1$ tasks in that processor.
Hence, a schedule that minimizes the mean flow time is one that minimizes
the weighted sum in eq. (5-1) over all valid sets of weights $\{w_1, w_2, \ldots, w_m\}$.

Suppose that the execution times of $T_1, T_2, \ldots, T_m$ are such that
$\mu(T_1) \ge \mu(T_2) \ge \cdots \ge \mu(T_m)$. Let r be an integer such that

\[
\lfloor rb \rfloor + rn < m \le \lfloor (r+1)b \rfloor + (r+1)\,n \tag{5-2}
\]

where $\lfloor x \rfloor$ denotes the integer part of x. We have
Theorem 5.1

When a set of tasks $(\mathcal{T}, \mu)$ is executed in the system $\mathcal{P} = (1, n;\ b, 1)$,
a set of weights $\{w_1, w_2, \ldots, w_m\}$ that minimizes the mean flow time is

\[
w_i =
\begin{cases}
\dfrac{i}{b}, & i = 1, 2, \ldots, \lfloor b \rfloor \\[2ex]
1, & i = \lfloor b \rfloor + 1, \lfloor b \rfloor + 2, \ldots, \lfloor b \rfloor + n \\[2ex]
\dfrac{i - sn}{b}, & i = \lfloor sb \rfloor + sn + 1, \lfloor sb \rfloor + sn + 2, \ldots, \lfloor (s+1)b \rfloor + sn, \\
 & \text{and } s = 1, 2, \ldots, r-1 \\[2ex]
s, & i = \lfloor sb \rfloor + (s-1)n + 1, \lfloor sb \rfloor + (s-1)n + 2, \ldots, \lfloor sb \rfloor + sn, \\
 & \text{and } s = 2, 3, \ldots, r
\end{cases} \tag{5-3}
\]

and

\[
w_i = \frac{i - rn}{b}, \qquad i = \lfloor rb \rfloor + rn + 1, \lfloor rb \rfloor + rn + 2, \ldots, m
\]

if $m - \lfloor rb \rfloor - rn \le \lfloor (r+1)b \rfloor - \lfloor rb \rfloor$, and

\[
w_i =
\begin{cases}
\dfrac{i - rn}{b}, & i = \lfloor rb \rfloor + rn + 1, \lfloor rb \rfloor + rn + 2, \ldots, \lfloor (r+1)b \rfloor + rn \\[2ex]
r+1, & i = \lfloor (r+1)b \rfloor + rn + 1, \lfloor (r+1)b \rfloor + rn + 2, \ldots, m
\end{cases} \tag{5-4}
\]

otherwise.
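The weights of Theorem 5.1 are the m smallest members of the merged sequences 1/b, 2/b, 3/b, ... (fast processor) and 1, 2, 3, ... (one such sequence per standard processor). The sketch below generates them by merging rather than by the closed-form index ranges, and pairs them with the tasks in longest-first order. It is an illustrative assumption of how one might compute the weighted sum in eq. (5-1), with hypothetical function names, and assumes b >= 1.

```python
import heapq
from fractions import Fraction


def optimal_weights(m, n, b):
    """Return the m smallest valid weights in non-decreasing order."""
    b = Fraction(b)
    # One entry per processor: (next available weight, step between weights).
    heap = [(1 / b, 1 / b)] + [(Fraction(1), Fraction(1))] * n
    heapq.heapify(heap)
    weights = []
    for _ in range(m):
        weight, step = heapq.heappop(heap)
        weights.append(weight)
        heapq.heappush(heap, (weight + step, step))
    return weights


def min_weighted_sum(times, n, b):
    """Value of eq. (5-1) when smaller weights go to longer tasks."""
    longest_first = sorted(times, reverse=True)
    weights = optimal_weights(len(times), n, b)
    return sum(w * t for w, t in zip(weights, longest_first))


print(optimal_weights(5, n=1, b=2))
print(min_weighted_sum([5, 4, 3, 2, 1], n=1, b=2))
```

For b = 2 and n = 1 the first five weights agree with the index ranges of eq. (5-3): 1/2 and 1 from the fast processor, 1 from the standard processor, then 3/2 and 2 from the fast processor.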
Proof.

We show that there is a schedule for which the weights $w_i$ are
those given by eqs. (5-3) and (5-4). To construct this schedule, we
partition the tasks in $\mathcal{T}$ into r+1 blocks, $B_1, B_2, \ldots, B_{r+1}$ where

\[
\begin{aligned}
B_1 &= \{T_1, T_2, \ldots, T_{\lfloor b \rfloor},\ T_{\lfloor b \rfloor+1}, T_{\lfloor b \rfloor+2}, \ldots, T_{\lfloor b \rfloor+n}\} \\
B_2 &= \{T_{\lfloor b \rfloor+n+1}, T_{\lfloor b \rfloor+n+2}, \ldots, T_{\lfloor 2b \rfloor+n},\ T_{\lfloor 2b \rfloor+n+1}, \ldots, T_{\lfloor 2b \rfloor+2n}\} \\
 &\ \ \vdots \\
B_r &= \{T_{\lfloor (r-1)b \rfloor+(r-1)n+1}, T_{\lfloor (r-1)b \rfloor+(r-1)n+2}, \ldots, T_{\lfloor rb \rfloor+(r-1)n},\ T_{\lfloor rb \rfloor+(r-1)n+1}, \ldots, T_{\lfloor rb \rfloor+rn}\} \\
B_{r+1} &= \{T_{\lfloor rb \rfloor+rn+1}, T_{\lfloor rb \rfloor+rn+2}, \ldots, T_{\lfloor (r+1)b \rfloor+rn},\ T_{\lfloor (r+1)b \rfloor+rn+1}, \ldots, T_m\}
\end{aligned}
\]

In the schedule,† the tasks $T_{\lfloor (r+1)b \rfloor+rn}, T_{\lfloor (r+1)b \rfloor+rn-1}, \ldots, T_{\lfloor rb \rfloor+rn+1}$ in
$B_{r+1}$ are assigned to the fast processor in that order and the remaining tasks
in $B_{r+1}$ are assigned to any $m - \lfloor (r+1)b \rfloor - rn$ of the standard processors, one in each
processor. Next, tasks $T_{\lfloor rb \rfloor+(r-1)n}, T_{\lfloor rb \rfloor+(r-1)n-1}, \ldots, T_{\lfloor (r-1)b \rfloor+(r-1)n+1}$
in $B_r$ are assigned to the fast processor in that order while the remaining
n tasks in $B_r$ are assigned to the n standard processors, one in each processor.
This procedure is repeated until the tasks $T_{\lfloor b \rfloor}, T_{\lfloor b \rfloor-1}, \ldots, T_2, T_1$ in $B_1$
are assigned to the fast processor and the remaining n tasks in $B_1$ are
assigned to the n standard processors. The resultant schedule is shown in
Figure 5-1a. We note that when b = 1, this schedule is just a shortest
execution time first schedule.

† We assume that $m - \lfloor rb \rfloor - rn > \lfloor (r+1)b \rfloor - \lfloor rb \rfloor$. When we have $m - \lfloor rb \rfloor - rn \le \lfloor (r+1)b \rfloor - \lfloor rb \rfloor$,
the tasks in $B_{r+1}$ are assigned to the fast processor. The resultant schedule
is as shown in Figure 5-1b.

To show that the set of weights $w_i$ in eqs. (5-3) and (5-4) indeed
minimizes the mean flow time t, we note that for any q tasks assigned to the
fast processor, the associated weights are

\[
\frac{1}{b},\ \frac{2}{b},\ \ldots,\ \frac{q-1}{b},\ \frac{q}{b}
\]

Similarly, for any q tasks assigned to a standard processor, the associated
weights are

\[
1,\ 2,\ \ldots,\ q
\]

Hence, the smallest m weights are

\[
\frac{1}{b}, \frac{2}{b}, \ldots, \frac{\lfloor b \rfloor}{b},\ \underbrace{1, 1, \ldots, 1}_{n},\ \frac{\lfloor b \rfloor+1}{b}, \frac{\lfloor b \rfloor+2}{b}, \ldots, \frac{\lfloor 2b \rfloor}{b},\ \underbrace{2, 2, \ldots, 2}_{n},\ \ldots,\ \underbrace{r, r, \ldots, r}_{n},\ \frac{\lfloor rb \rfloor+1}{b}, \frac{\lfloor rb \rfloor+2}{b}, \ldots, \frac{\lfloor (r+1)b \rfloor}{b},\ (r+1), \ldots, (r+1)
\]

which are exactly those given by eqs. (5-3) and (5-4). That the weighted sum
in eq. (5-1) is minimized follows from the assignment of smaller weights
to longer tasks.

The result in Theorem 5.1 can be generalized to the case when the set of tasks is executed on the system $\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$. Suppose that $b_1 > b_2 > \cdots > b_k$, and let

$$d_i = \frac{b_i}{b_{i+1}}, \qquad i = 1, 2, \ldots, k-1. \tag{5-6}$$

We have

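Theorem 5.2

When a set of tasks $(\mathcal{T}, \mu)$ is executed on the system $\mathcal{P} = (n_1, n_2, \ldots, n_k;\ b_1, b_2, \ldots, b_k)$, the weights $w_1, w_2, \ldots, w_m$ that minimize the mean flow time are given by

$$
\begin{aligned}
&\underbrace{\tfrac{1}{b_1},\ \ldots,\ \tfrac{1}{b_1}}_{n_1},\
\underbrace{\tfrac{2}{b_1},\ \ldots,\ \tfrac{2}{b_1}}_{n_1},\ \ldots,\
\underbrace{\tfrac{\lfloor d_1\rfloor}{b_1},\ \ldots,\ \tfrac{\lfloor d_1\rfloor}{b_1}}_{n_1},\\
&\underbrace{\tfrac{1}{b_2},\ \ldots,\ \tfrac{1}{b_2}}_{n_2},\
\underbrace{\tfrac{\lfloor d_1\rfloor+1}{b_1},\ \ldots,\ \tfrac{\lfloor d_1\rfloor+1}{b_1}}_{n_1},\
\underbrace{\tfrac{\lfloor d_1\rfloor+2}{b_1},\ \ldots,\ \tfrac{\lfloor d_1\rfloor+2}{b_1}}_{n_1},\ \ldots,\\
&\underbrace{\tfrac{\lfloor 2d_1\rfloor}{b_1},\ \ldots,\ \tfrac{\lfloor 2d_1\rfloor}{b_1}}_{n_1},\
\underbrace{\tfrac{2}{b_2},\ \ldots,\ \tfrac{2}{b_2}}_{n_2},\ \ldots,\\
&\underbrace{\tfrac{\lfloor d_2 d_1\rfloor}{b_1},\ \ldots,\ \tfrac{\lfloor d_2 d_1\rfloor}{b_1}}_{n_1},\
\underbrace{\tfrac{\lfloor d_2\rfloor}{b_2},\ \ldots,\ \tfrac{\lfloor d_2\rfloor}{b_2}}_{n_2},\
\underbrace{\tfrac{1}{b_3},\ \ldots,\ \tfrac{1}{b_3}}_{n_3},\ \ldots
\end{aligned}
\tag{5-6}
$$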


The proof of this theorem, being similar to that of Theorem 5.1, is
not repeated here.
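The ordering in Theorem 5.2 can be viewed as a merge of the sorted weight sequences offered by each class of processors. The sketch below is one way to enumerate it, assuming `counts` holds $n_1, \ldots, n_k$ and `speeds` holds $b_1, \ldots, b_k$; the function names are hypothetical and are not taken from the text.

```python
from fractions import Fraction
from heapq import merge
from itertools import count, islice


def class_weights(n_i, b_i):
    """Yield 1/b_i, 2/b_i, ... with each value repeated n_i times,
    one copy per processor of speed b_i."""
    for j in count(1):
        for _ in range(n_i):
            yield Fraction(j) / Fraction(b_i)


def smallest_weights(m, counts, speeds):
    """Return the m smallest weights over all processor classes."""
    streams = [class_weights(n_i, b_i) for n_i, b_i in zip(counts, speeds)]
    # heapq.merge lazily merges already-sorted (here infinite) streams.
    return list(islice(merge(*streams), m))


if __name__ == "__main__":
    # One processor of speed 3 and two of speed 1, so d_1 = 3:
    # 1/3, 2/3, 3/3 come first, tied with the two weights equal to 1.
    print(smallest_weights(6, (1, 2), (3, 1)))
```

With `counts=(1, n)` and `speeds=(b, 1)` this reduces to the single fast processor case of Theorem 5.1.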

5.2  Comparison  of  Systems 

We observed in Sec. 3.4 that, on the basis of the worst-case performance of priority-driven schedules, a processor of speed b is more desirable than b processors of speed 1. We now show that the same conclusion can be reached when the performance of two multiprocessor systems is compared on the basis of minimum mean flow time. Let us consider two multiprocessor systems $\mathcal{P} = (1, n;\ b, 1)$ and $\mathcal{P}' = (n+b;\ 1)$, where b is an integer. Let t and t' denote the minimum mean flow times when a set of tasks $(\mathcal{T}, \mu)$ is executed on the systems $\mathcal{P}$ and $\mathcal{P}'$, respectively. Again, we suppose that $\mu(T_1) \ge \mu(T_2) \ge \cdots \ge \mu(T_m)$. We have
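Theorem 5.3

$$t \le t' - \frac{b-1}{2}\left\lfloor\frac{m}{n+b}\right\rfloor \mu(T_m).$$

The bound is best possible when $\frac{m}{n+b}$ is an integer and $\mu(T_1) = \mu(T_2) = \cdots = \mu(T_m)$. In this case

$$t = t' - \frac{b-1}{2}\,\omega_0,$$

where $\omega_0$ is the minimum completion time of the tasks $(\mathcal{T}, \mu)$ on the system $\mathcal{P}$.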
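A small numerical check can make the statement of Theorem 5.3 concrete. The sketch below is only an illustration under stated assumptions: it computes t and t' as plain weighted sums (smallest weights paired with longest tasks, with no division by m), and the names `flow_time_sum` and `theorem_5_3_values` are hypothetical.

```python
from fractions import Fraction


def flow_time_sum(times, speeds):
    """Weighted sum of execution times; `speeds` lists one entry per processor."""
    m = len(times)
    # A processor of speed s offers weights 1/s, 2/s, ..., m/s; keep the m smallest.
    candidates = sorted(
        Fraction(j) / Fraction(s) for s in speeds for j in range(1, m + 1)
    )
    longest_first = sorted(times, reverse=True)
    # zip stops after m pairs, so only the m smallest weights are used.
    return sum(w * mu for w, mu in zip(candidates, longest_first))


def theorem_5_3_values(times, n, b):
    """Return (t, t_prime, bound) for integer b, following Theorem 5.3."""
    m = len(times)
    t = flow_time_sum(times, [b] + [1] * n)      # one fast, n standard processors
    t_prime = flow_time_sum(times, [1] * (n + b))  # n + b standard processors
    bound = t_prime - Fraction(b - 1, 2) * (m // (n + b)) * min(times)
    return t, t_prime, bound


if __name__ == "__main__":
    t, t_prime, bound = theorem_5_3_values([5, 4, 3, 2, 2, 1], n=1, b=2)
    print(t, t_prime, bound)  # prints 37/2 22 21
    # Equal tasks with m/(n+b) an integer: the bound is attained.
    print(theorem_5_3_values([2] * 6, n=1, b=2))
```

In the equal-task case, m = 6, n + b = 3 and every execution time is 2, so the correction term is (b-1)/2 times 6·2/3 = 4, that is 2, matching the gap between 18 and 16.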

Proof.

Consider the optimal schedule for the set of tasks $(\mathcal{T}, \mu)$ on the system $\mathcal{P} = (1, n;\ b, 1)$ shown in Figure 5-2a. The mean flow time t can be written as

$$t = \sum_{i=1}^{r+1} f_i \tag{5-7}$$

where

$$
\begin{aligned}
f_i ={}& \frac{1}{b}\Bigl\{[(i-1)b+1]\,\mu(T_{(i-1)(b+n)+1}) + [(i-1)b+2]\,\mu(T_{(i-1)(b+n)+2}) + \cdots + ib\,\mu(T_{ib+(i-1)n})\Bigr\}\\
&+ i\Bigl[\mu(T_{ib+(i-1)n+1}) + \mu(T_{ib+(i-1)n+2}) + \cdots + \mu(T_{i(b+n)})\Bigr], \qquad i = 1, 2, \ldots, r
\end{aligned}
\tag{5-8}
$$

and

$$
f_{r+1} =
\begin{cases}
\dfrac{1}{b}\Bigl\{(rb+1)\,\mu(T_{r(b+n)+1}) + (rb+2)\,\mu(T_{r(b+n)+2}) + \cdots + (m-rn)\,\mu(T_m)\Bigr\}
& \text{if } m-r(n+b) \le b,\\[2ex]
\dfrac{1}{b}\Bigl\{(rb+1)\,\mu(T_{r(b+n)+1}) + (rb+2)\,\mu(T_{r(b+n)+2}) + \cdots + (r+1)b\,\mu(T_{(r+1)b+rn})\Bigr\}\\
\qquad + (r+1)\,\mu(T_{(r+1)b+rn+1}) + \cdots + (r+1)\,\mu(T_m)
& \text{if } m-r(n+b) > b.
\end{cases}
\tag{5-9}
$$

[Figure 5-2. Block $B_i$ of an optimal schedule: (a) on the system $\mathcal{P}$, with (i-1)b tasks on the fast processor and (i-1) tasks on each of the standard processors; (b) on the system $\mathcal{P}'$, with (i-1) tasks on each processor.]


Similarly, an optimal schedule for the set of tasks $(\mathcal{T}, \mu)$ on the system $\mathcal{P}'$ is shown in Figure 5-2b. Let

$$t' = \sum_{i=1}^{r+1} f'_i$$

where

$$f'_i = i\Bigl[\mu(T_{(i-1)(n+b)+1}) + \mu(T_{(i-1)(n+b)+2}) + \cdots + \mu(T_{i(n+b)})\Bigr], \qquad i = 1, 2, \ldots, r$$

and

$$f'_{r+1} = (r+1)\Bigl[\mu(T_{r(n+b)+1}) + \mu(T_{r(n+b)+2}) + \cdots + \mu(T_m)\Bigr].$$

The expressions for $f_i$ in eqs. (5-8) and (5-9) can be written as

$$f_i = f'_i - \sum_{j=1}^{b}\Bigl(1-\frac{j}{b}\Bigr)\,\mu(T_{(i-1)(b+n)+j}), \qquad i = 1, 2, \ldots, r. \tag{5-10}$$

Similarly,

$$f_{r+1} = f'_{r+1} - \sum_{j=1}^{\alpha}\Bigl(1-\frac{j}{b}\Bigr)\,\mu(T_{r(b+n)+j}) \tag{5-11}$$

where

$$\alpha = \min[b,\ m-r(n+b)].$$

Substituting eqs. (5-10) and (5-11) into eq. (5-7), we obtain

$$t = t' - \sum_{i=1}^{r}\sum_{j=1}^{b}\Bigl(1-\frac{j}{b}\Bigr)\,\mu(T_{(i-1)(b+n)+j}) - \sum_{j=1}^{\alpha}\Bigl(1-\frac{j}{b}\Bigr)\,\mu(T_{r(b+n)+j}).$$

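Since $\mu(T_m) \le \mu(T_i)$ for $i = 1, 2, \ldots, m-1$, we have

$$
\begin{aligned}
t &\le t' - \sum_{i=1}^{r}\sum_{j=1}^{b}\Bigl(1-\frac{j}{b}\Bigr)\,\mu(T_m) - \sum_{j=1}^{\alpha}\Bigl(1-\frac{j}{b}\Bigr)\,\mu(T_m)\\
&\le t' - r\,\frac{b-1}{2}\,\mu(T_m).
\end{aligned}
$$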
But

$$r = \left\lfloor\frac{m}{n+b}\right\rfloor.$$

Therefore

$$t \le t' - \frac{b-1}{2}\left\lfloor\frac{m}{n+b}\right\rfloor \mu(T_m).$$
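The factor $(b-1)/2$ in the bound comes from a sum that the derivation leaves implicit. Spelled out for integer b, as a reading aid rather than part of the original argument:

$$\sum_{j=1}^{b}\Bigl(1-\frac{j}{b}\Bigr) = b - \frac{1}{b}\cdot\frac{b(b+1)}{2} = \frac{b-1}{2}.$$

Each of the r full blocks therefore contributes at least $\frac{b-1}{2}\,\mu(T_m)$, while the remaining sum over $j = 1, \ldots, \alpha$ is nonnegative because $\alpha \le b$, so it can be dropped without invalidating the inequality.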

When $\frac{m}{n+b}$ is an integer and $\mu(T_1) = \mu(T_2) = \cdots = \mu(T_m)$,

$$\frac{m\,\mu(T_m)}{n+b} = \omega_0$$

and
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